Disaster Recovery International Style
- Published on Tuesday, 30 October 2007 08:26
At midnight, December 31, 1992, the entire economic structure of Europe will change dramatically. At that time the European Community (EC) will unite into a single common market. This unprecedented act, under which currency, information, goods, people, and services may move freely among the 12 EC member countries, will alter the way U.S. companies with foreign interests plan for disaster recovery. The EC member countries include the following:
- The Netherlands
- The United Kingdom
- West Germany
Since the Single European Act was adopted in 1987, many U.S. companies have already begun moving their information processing activities closer to their foreign customers. Beyond the most obvious reasons for moving their IS operations, such as cost, U.S. companies have one additional compelling reason: a need for uninterrupted business operations. Additionally, attempting to support subsidiaries in Europe from the U.S. may require very expensive satellite communications.
Companies moving their IS operations will face many challenges: a lack of systems standards, different hardware and software support, telecommunication networks that predate World War II, and a lack of standard network protocols.
The need for disaster recovery has long been recognized in Europe, especially in the United Kingdom and West Germany. In fact, both of these countries have had commercial disaster recovery centers available since the late 1970s. Not only are Europeans less forgiving of errors and more demanding of quality and reliability, but the number of disasters throughout the continent is growing at an alarming rate. This article examines the types of disasters that have affected international data centers, in the hope that it will alert U.S. companies moving their IS operations overseas to the risks involved. Although many of the risks are similar to those in the U.S., others, such as terrorism, are not.
Even though the sample size is relatively small (statistically speaking), we believe that the data supports our position that risks abound abroad as well as in the U.S.
ANALYSIS OF DISASTERS BY TYPE
Fire shares the spotlight with terrorism as the single largest source (17.5%) of disasters internationally. This compares to 13.2% in the U.S. In the case files where fire was the cause, the resulting damage was the most extensive. In virtually all instances, the data center and the building itself were totally lost. In one particular case, The National Bank of Australia lost an $8 million IBM 3090 processor just weeks after it was installed.
Terrorism is one risk event that most U.S. IS managers have not had to deal with. In fact, terrorism accounts for only 3.4% of computer-related disasters in the U.S. Unfortunately, terrorism internationally has accounted for as many incidents as fire. Moreover, the dollar loss resulting from acts of terrorism is far in excess of that from any other type of disaster incident. A favorite target of the terrorists has been the service industry, in particular computer companies. However, terrorists have not limited their activities to this industry sector, as evidenced by the bombing of Interpol Headquarters in London this past year.
The fact that hurricanes and tornadoes rank as high as they do (14.0%) is a bit of a statistical anomaly. If the Caribbean countries were left out of the international disaster scoreboard, the percentage would be low. All the case files for hurricanes and tornadoes have originated in the Caribbean countries and Canada. In all cases the damage to the actual facility has been light; however, the resulting loss of power caused extensive outages. Although hurricanes and tornadoes can strike the continent of Europe, they are rare. The greatest risk for these types of storms is in Australia, the Caribbean countries, Japan, and Mexico.
Clearly the worst place to locate a data center internationally in terms of earthquake risk would be Mexico or the Philippines. Not only has substantial damage been caused to IS equipment itself, but IS staffs have experienced heavy loss of life and injury. In these countries, where trained personnel are at a premium, loss of critical IS personnel is more devastating than the loss of equipment or vital records. In one particular case, a seemingly mild earthquake in Mexico City caused the loss of six senior IS personnel at a bank. Earthquakes account for a smaller share of data center disasters internationally than in the U.S. (10.0% internationally versus 12.8% in the U.S.). However, the actual occurrence of earthquakes is much higher internationally. The disparity is due to the heavy concentration of data centers near major fault lines in the U.S.
Surprisingly, power outages account for only 9.5% of international disasters versus 15.1% in the U.S. This is primarily due to the wide acceptance of uninterruptible power systems (UPS) internationally. Reliable power has been more of a problem internationally than in the U.S.; consequently, companies have done much to insulate themselves from power problems.
Software errors resulting in extended down time are more pervasive internationally: the occurrence is 8.8% versus 3.3% in the U.S. This is predominantly because fewer technically oriented IS personnel are available internationally. In the case files we reviewed, the initial software error was relatively minor; however, a series of failed recovery efforts made the situation progressively worse. An argument could be made that the disaster event was caused by human error and not a software error; however, Contingency Planning Research (CPR) classifies disasters by the event that triggered the disaster.
Canadian and Australian data centers suffer more from flooding conditions than those anywhere else. The incidence of flood-related disasters internationally is virtually the same as in the U.S. (7.0%). In most cases, the flooding could have been avoided by proper drainage. CPR research suggests a universal problem with municipal drainage systems in those countries.
Hardware-error-related events account for approximately the same percentage of disasters in the U.S. and abroad. However, their causes are very different. Internationally, many of the computer systems in use are either discontinued models from OEM vendors or very old, and this is the primary reason for the occurrence of hardware errors. In the U.S. case files, the cause is almost the opposite: most hardware problems result from leading-edge technology, where the user was one of the first to install a particular piece of equipment. This was the case with a major U.S. company that lost 165 IBM 3380 HDAs over the course of two weeks.
Burst pipes accounted for 3.5% of the data center outages internationally. However, in most cases the disasters were avoidable. A case in point: during an extremely cold winter, an IS manager decided to cut off the heat to a lights-out operations center to save on electricity. However, he forgot about the sprinkler system, which subsequently burst and flooded the entire data center.
As in the U.S., network outages account for a fairly low percentage of data center disasters (3.5%). This low number is primarily due to the difficulty of uncovering network-related outages and the short duration of these types of outages. Internationally, however, CPR sees a trend in which the incidence of network outages may increase as the EC single market comes closer. The significant amount of new construction and the wide deployment of networks could make network outages more common throughout Europe.
Analyzing the causes of disasters internationally should give IS managers a sense of the trouble areas to avoid and help them realize that disasters are not confined to the U.S.
Both domestic and international users and IS management should be involved in the preparation and implementation of effective business recovery plans.
Tari Schreider is the President of Contingency Planning Research, Inc., a four-year-old disaster recovery consultancy. He is responsible for contingency planning and disaster recovery strategies, new technology development, and risk analysis.
This article was adapted from Vol. 3 No. 2, p. 14.