Planning International Style

October 30, 2007

International Disaster Recovery Planning

The trademark of the 90s--sophisticated technologies that enable a new ease of communications, both nationally and internationally--combined with the consistent expansion of companies has created a stronger link among businesses and nations throughout the world. Furthermore, the creation of a new European common market in 1992 will also profoundly affect and transform the way in which we conduct business, and it is apt to increase our direct involvement in foreign business affairs. It may soon be insufficient to plan for a disaster that affects only your company if you have vested interests abroad. This survey should give you some idea of the state of the disaster recovery industry in a variety of countries as well as the levels of involvement of several businesses.

Australia

Submitted by Wayne Lewis, CDRP, a disaster recovery consultant with the largest bank in the Pacific region:

Disaster Recovery Planning in Australia is still very much in its infancy, but it is gaining momentum each year. Its development has been held back by a lack of available education and supported methodologies, and by distance from the countries that are advanced in this field.

The major banks in the mid-80s were perhaps the first to realize the necessity of being able to fully recover applications in a timely manner, and began to dabble in this field.

Since that time, realization of its importance has been growing--in the late 80s, Government departments (both Federal and State), service organizations, and manufacturing companies began to realize that an interruption to their services would not be tolerated for a long period of time by their customers.

Even today, some Australian management does not fully accept the need for disaster recovery, its principles, or its discipline. However, the overall trend is that management is realizing that DR is not a task that can be done when there are a few spare hours.

DR in Australia mostly focuses on the repercussions of DP interruption or withdrawal (especially when an unplanned incident may have recently occurred) rather than examining DR from a variety of angles. While it is important to secure DP services, they are of little use if your clients cannot access their work place to use the equipment or services.

Strategy development in this discipline requires factual information. Armed with such information, one can then jettison the piecemeal or knee-jerk approach which is often the direction DR takes.

One way to obtain such information is the Business Impact Analysis. This contains the data provided by clients/customers. The BIA data can guide strategy development so we are able to put in place procedures that can be followed to avoid or reduce potential impacts.

Many CEOs, if they really had an idea of the powder keg they are accountable for and the potential dollars that their company could lose, would certainly act on the information available rather than wait for an event to occur. The acceptance of the BIA in strategy development in Australia has yet to be fully realized.

The number of organizations in Australia that provide effective and viable hot-sites (medium-large), although growing, can still be counted on one hand. Large organizations, being the ones more severely impacted, must often resort to duplicate facilities.

The growth of suppliers and other third parties offering hot-sites or similar type arrangements for mid-range equipment, though long overdue, has been an exciting development in DR in Australia over the past two years.

At this stage, Australian governments (Federal or State) have not legislated to ensure that Financial Institutions have effective or demonstrable disaster recovery procedures in place. Like most DR planners, however, I believe that it is on the horizon.

As the 90s begin to unfold, it is hoped that organizations will begin to be more proactive by looking at the inherent vulnerabilities that threaten the survival of corporations (as well as the gainful employment of Disaster Recovery Professionals!).

October 24, 1990 is still fresh in the minds of the Center Parcs organization of Rotterdam, Holland because that is the day that the covered swimming pool in their vacation water park in Erperheide, Belgium was completely destroyed by fire. The fire started in the electrical transformer near the sauna building in the pool facility. Bungalowpark Erperheide is located in Belgium just south of Eindhoven, Holland, and consists of about 600 luxury apartments and bungalows secluded in the forest and arranged around several lakes. The park, with a nearly constant occupancy of almost 100%, is one of two in Belgium. Because of its occupancy rate and size, Erperheide generates substantial revenue for the Center Parcs organization.

The Loss of Two Vacation Seasons

One of the major problems associated with the loss was that the fire occurred just before the important 1990 winter vacation season. Because the park was essentially not operable without the pool facilities, most vacationers would be reluctant to make advance reservations for a park whose primary facilities had been destroyed and might not be ready for the 1991 vacation season. This situation, coupled with a large time-related insurance policy, demanded that the facility be put back into service as soon as possible.

Center Parcs Construction Experience

The Center Parcs organization, under the guidance of Adjunct-Director J.W.M. Timmermans, is responsible for over $150 million in park construction per year. The organization has its own in-house engineering and construction group and has built a reputation for successful fast-track construction of parks throughout western Europe. Center Parcs has many long-standing relationships with contractors, equipment manufacturers, and material suppliers throughout western Europe. These relationships have contributed greatly to the success of this reconstruction project because Mr. Timmermans was able to secure material and equipment suppliers’ commitments immediately. Because of the ongoing relationships with several contractors, Mr. Timmermans was able to select the contractors that were well known for their speed and quality.

Reconstruction Project Complexity

The Erperheide Bungalowpark is a complex facility that operationally depends on many system and facility interdependencies. The pool enclosure structure is an advanced technology concrete, wood, steel, and plastic structure that incorporates much of the heating, venting, and air-conditioning ductwork, and would be a challenge to construct in a normal timeframe. The large swimming pool and related equipment, such as the filtration and chlorination systems and the wave-making apparatus, present construction sequencing and coordination problems that are made even more difficult when constructed in an accelerated reconstruction schedule, as is the case with this project.

The auxiliary facilities, such as the children’s pool, the sauna building, and the wild water ride, are additional areas that must be constructed as individual facilities, and yet be incorporated in the overall project such that they may be completed without interfering with construction of the large pool and enclosure structure. The central building heating plant is a support system that must be constructed as an individual facility and incorporated in the overall project plan to support the construction requirements for the other facilities.

Demolition

Because of the complexity of the facility and the vast number of interdependent systems, a detailed engineering analysis and determination of the systems to be saved and salvaged would be expensive, difficult, and time-consuming. A broad-based demolition plan based on a reconstruction design by Center Parcs was put into place. Because the original elevated slab was complex with many different types of concrete masonry units and bricks supporting an elevated foundation, the determination was made that a simple slab-on-grade would be used to replace the original platform slab, and the demolition was planned accordingly. The extensive damage to the large roof support structure footings precluded reconstruction of the same footings in the same location because of the time required for demolition prior to reconstruction.

Reconstruction

The major problem with the original structure was that very large glue-laminated wooden beams that were over 30 meters long were used and could not be immediately replaced by the largest laminated beam manufacturers in Europe. This was partly because the wood treatment pressure vessel that was used for the original beams was no longer available, and the beams could not be moved into the area because of traffic restrictions. Because of this, the decision was made to build the new structure of smaller laminated beams around the original structure. This decision also allowed many activities to begin immediately and be worked concurrently. These activities include the following:

  • Finalize the redesign of the pool enclosure structure and the required mechanical systems as soon as possible based on the design decision.
  • Immediately start construction of the structural footings outside of the original structure while the demolition of the original structure is underway.
  • Order all the destroyed HVAC, mechanical, and electrical systems equipment as a function of the redesign rather than as a function of a long functional testing and rehabilitation program.
  • Develop a project site layout and organization plan as a function of the redesign and the current site configuration. When the structure was originally built, the surrounding park infrastructure was not well developed and posed no problem to construction. Because the park around the pool structure is now functional, extensive constraints are imposed on the building of the structure.
  • Select a pool enclosure glazing material that would allow fast manufacture, delivery, quick erection, and ongoing work activities underneath while the roof is being installed.
  • Develop rapid reconstruction methods that were designed for the particular structural components used in the reconstruction project. All design decisions were weighed with respect to quality, safety, and rapid reconstruction constructability.

Major Project Milestones

The project milestones for this reconstruction project are a function of the completion of important structural items, manufacturing and delivery of important equipment, and delivery of materials. Some of the major milestones are as follows:

  • Completion of the central heating building.
  • Manufacture and delivery of the heating systems equipment and associated hardware.
  • Construction completion of the structural footings for the pool enclosure structure.
  • Manufacture and delivery of the main pool enclosure glue-laminated wood and steel beams.
  • Construction completion of the pool enclosure structure and subsequent heating of the building.
  • Installation and completion of the pool area floor heating system that will allow the tile setting operation to start.

Other project milestones were developed in the form of a project milestone summary schedule and were changed as a function of completion of the items.

Catastrophe Reconstruction Processes

Several catastrophe reconstruction processes contributed significantly to the speed of this reconstruction project. Some of these processes included the following:

  • Critical point scheduling: a system of milestone, intermediate, and detailed scheduling that highlights the critical activities (not just the critical path) for daily analysis and problem resolution.
  • Develop a strong project team: working closely with contractors and suppliers in the past has established a good atmosphere for rapid completion. Perform extensive planning and scheduling operations with the project team at the project site.
  • Secure the services of contractors and suppliers through the holiday periods with extra payments that will save time-related expenses that would be caused by an extended schedule.
  • Utilize temporary weather protection enclosures like sprung structures or circus tents to protect selected areas of the project during the winter weather.
  • Make extensive use of equipment that will save labor time and costs. An example is the use of several hydraulic manlifts to reduce the time required for climbing.
  • Apply materials management techniques: manage materials from procurement and manufacturing through delivery, storage, and final placement in the work area.

The combination of a capable engineering and construction organization in the form of Center Parcs, with the catastrophe management experience of Evans American, helped to considerably reduce the length of the project schedule and hold the line on time-related costs. The project team consisting of Center Parcs, Robins Takkenberg, and The Evans American Corporation worked together to reduce the overall cost of the claim. This type of immediate catastrophe management response and coordination can contribute to reducing the total claim cost for any type of catastrophic project.


Michael Stall is the vice president of construction with the Evans American Corporation.

This article adapted from Vol. 4, No. 2, p. 49.

October 30, 2007

How to Travel During A War

With the termination of the war against Iraq, the threat of terrorist activity to “avenge” American interference in the Gulf region is becoming an increasing concern for American businesses. The airline industry is one of the most frequent targets of terrorism; unfortunately, it is impossible for many businesses to restrict or halt all travel. If you must travel, especially out of the U.S., here are a few tips to mitigate your risks both while travelling and while staying in a foreign country.

Packing

1. Avoid superfluous items in your carry-on luggage

  • excess credit cards
  • police/military ID
  • letters and documents
  • products marked with your organization’s logo
  • expensive or religious jewelry

2. Depending on your situation, you may want to bring

  • a photocopy of your passport, tickets, and other essential information
  • a list of your credit card numbers and loss notification numbers
  • enough required medication to account for delays

3. Travel with secure luggage

  • use hard, locked, non-descript luggage
  • use closed-faced luggage tags
  • keep essential items in your carry-on bag in case of lost luggage

Travelling Safely

  • try to travel at off-peak hours, when large numbers of people won’t be congregated in one area
  • reconfirm your flight several days in advance and arrive at the airport early
  • move immediately through the security check area to the departure gate or airline club
  • wait near a protective barrier and structural support column or wall, away from expanses of glass, garbage cans, luggage lockers, telephones, ticket booths, and vendors’ carts, where bombs may be hidden
  • note the emergency exits

On the Airplane

  • select a window seat near an exit, toward the back
  • check under your seat for left luggage
  • don’t give any personal information to fellow passengers
  • count the number of rows to the exits and note how to operate them so that you would be able to escape in the dark or smoke

Arrival

  • ask anyone meeting you to be inconspicuous
  • know who will be meeting you, their name and appearance. Call to confirm any changes.
  • if you will be staying for more than two days, register with the consulate


The Hotel

  • ask for a room off the ground floor, but not so high as to prevent escape in case of fire
  • check your exit options, count the number of doors to the nearest exit, and learn how to operate the escape systems
  • when leaving the room, leave on a radio or TV to give the appearance that you are in
  • do not leave your key at the front desk
  • do not leave valuables or important information (such as your itinerary) in the room.

While You Are Away

  • stay alert and aware of your surroundings; keep an eye on belongings and your hand on valuables
  • do not pay with large bills or count currency in public
  • know the locations of nearby police stations, embassies, or military barracks
  • stay away from crowds and walk away from any disturbances
  • learn how to operate the telephone and carry enough coins or tokens to use a pay phone
  • always carry your passport (a legal requirement in many countries) and guard against pickpockets
  • carry the phone numbers of emergency contacts--your host, organization, and embassy
  • keep your distance from the curb

Margo Young is a staff writer for the Disaster Recovery Journal.

This article adapted from Vol. 4, No. 2, p. 62.

During early 1990, the administrative headquarters at Digital’s customer services division, located in southern England, was burned to the ground, resulting in six VAX 6000 systems and approximately three hundred terminals suffering severe smoke damage. Despite high-profile occurrences such as the DEC fire, it is estimated that over three quarters of the medium-to-large computer users within the United Kingdom have yet to formulate a workable disaster recovery plan, even though they recognize that their systems are essential to the running of their businesses. DEC was fortunate, as a computer manufacturer, to have access to spare machines upon which archived data was reloaded; even so, it was admitted that the loss of the building would cause weeks of disruption.

As serious as the DEC fire was, computer users often appear reluctant to address the vulnerability of their systems. This is often due not to a lack of awareness, but to a lack of will to take positive action.

However, disaster recovery means different things to different people. The provision of third-party services has increased substantially over recent years. In days gone by, when batch systems predominated, loose arrangements often existed among users of comparable systems; these were increasingly seen as risky, inadequate and unenforceable. Cold facilities were then offered, as it was argued, with a certain degree of truth, that the computer facility itself and its supporting environmental systems were difficult to reinstate, whereas computer equipment was easily available and timescales were not so stringent that delays were unacceptable. To overcome these shortcomings, companies offered access to a compatible machine in return for an annual subscription fee. Quite often, these companies were processing bureaus seeking alternative business to offset declining bureau activity and, incorrectly, seeing disaster recovery as an easy and lucrative way out. Needless to say, companies with this attitude fell by the wayside as the true requirements and commitments of disaster recovery were realized. Even so, a large number of offerings are available, possibly out of proportion to the number of installations.

In the United Kingdom, there are currently in excess of 35 organizations offering recovery services of some description. An industry survey took place in 1987 to project the growth of disaster recovery services within Europe. Ninety-five percent of the market was found to be in the hands of independent suppliers.

Figure 1 illustrates the expected growth of the market by sub-sector over the five year period to 1992. Detailed analysis of the results indicated that mobile recovery facilities represent the highest growth segment.

Figure 2 shows the expected growth of the market broken down across those countries which are considered to be the major market for these services in Western Europe. Nearly all suppliers of recovery services operate within the confines of their own national markets.

From a purely data processing point of view, a large number of installations have signed up with these suppliers for emergency processing. Too often, a fixed provision is made within the data processing budget for a recovery service to replace equipment in the event of an adverse occurrence. At Alkemi, we have found that a properly prepared, tested and updated plan is a rarity. On a large number of occasions, the whole problem has been laid at the door of data processing management. Under these circumstances, recovery has incorrectly become a purely data processing issue, and the recovery strategy consists entirely of resurrecting all systems without identifying those whose non-availability would inflict serious damage on one or more key areas of operation.

Alkemi recently carried out a survey to investigate this apparent lack of formal planning. Representatives from 270 companies known to have had an active involvement with, or interest in, disaster recovery planning were interviewed. The object of the survey was to establish how successful they had been in implementing disaster recovery plans within their organizations.

Only 88 companies had a plan in place or in progress, although 58 of the companies spoken to were unable or unwilling to comment on the presence or absence of a plan.

Of the 124 with no plans, 68 stated that this was due to lack of time and resources; such resources as were available were usually diverted to activities considered to have a greater priority. Lack of management support was the reason given by 50 of the 124; six blamed lack of funds.

Furthermore, 102 of the 124 companies with no contingency plan did have an individual designated as “responsible for security and contingency planning,” but that individual received no management support. This indicates that senior management had shown little interest or was ignorant of the risks, or that established operational recovery procedures probably involved the immediate computer facility only, without regard to the prime functions of the organization.
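
The survey figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch that uses only the counts given in the text; the variable names are illustrative, and the percentages are derived here rather than reported by the survey itself.

```python
# Counts quoted in the survey discussion above; names are illustrative.
interviewed = 270
with_plan = 88     # plan in place or in progress
no_comment = 58    # unable or unwilling to comment
no_plan = 124

# Reasons given by the 124 companies without a plan.
reasons = {
    "lack of time and resources": 68,
    "lack of management support": 50,
    "lack of funds": 6,
}
designated_but_unsupported = 102


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The three groups account for every company interviewed.
assert with_plan + no_comment + no_plan == interviewed
# The three reasons account for every company without a plan.
assert sum(reasons.values()) == no_plan

summary = {
    "with plan": share(with_plan, interviewed),
    "no comment": share(no_comment, interviewed),
    "no plan": share(no_plan, interviewed),
}
reason_shares = {name: share(count, no_plan) for name, count in reasons.items()}
designated_share = share(designated_but_unsupported, no_plan)

print(summary)
print(reason_shares)
print(designated_share)
```

On these figures, roughly a third of the companies interviewed had a plan in place or in progress, and about four in five of those without a plan had a designated individual who lacked management support.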

At Alkemi, we are in possession of a case study concerning an organization which, although it had subscribed to a third party recovery service, did not have a full plan in place. The main points are as follows:

Due to the lack of both fire protection and detection equipment, a fire in an air-conditioning unit caused severe smoke damage to all installed computer equipment. By the time the day shift arrived, the fire had fortunately burned itself out. It was not until the Marketing Director arrived that positive action was taken. A service engineer was called who attempted to clean the machine, while at the same time relocating it to the conference room. An unsuccessful attempt was made to power up the machine. It was then decided to invoke the disaster recovery contract; replacement equipment arrived and was installed in the conference room, which had been converted into a makeshift computer room. Backup data was loaded and reconfiguration took place to emulate the original machine as nearly as possible. Following re-ordering, it was anticipated that replacement hardware in the form of either repaired or completely new equipment would replace the recovery machine within three to four weeks. The organization was fortunate that the fire was both self-contained and self-extinguishing, allowing the adaptation of the existing premises and normal operational housekeeping procedures to enable recovery to take place. However, had the floor above the computer room, which contained a research laboratory full of flammable substances, been ignited, the entire building would have been lost and the final outcome substantially different.

Europe has generally lagged behind the United States as far as statutory protection requirements are concerned. Within the United Kingdom, there is currently no statutory obligation for companies to have proper disaster recovery plans in place, although peripheral legislation requires certain, but somewhat vague, action to have been taken.

For example, under the 1987 Banking Act, auditors are required to report to the Bank of England on the adequacy of internal controls, and one of the areas of concern is that of business interruption. The Bank of England guidelines state that “there should be adequate recovery procedures or standby arrangements in place and tested to call on when events occur which cause computer systems to fail.”

Also in the United Kingdom, Building Societies (similar to the U.S. Savings and Loans corporations) are governed by the Building Societies Act, under which auditors are required to report to the Building Societies Commission in respect of the adequacy of recovery arrangements. Under the Financial Services Act, an application for authorization by the Securities Association includes the question, “are backup facilities available should the deal recording, reporting, settlement and accounting systems fail?”

The Data Protection Act requires that computerized systems are both accurate and properly protected against damage, accidental or otherwise. As an aside, the Data Protection Act does not extend to data held in paper format. None of the existing legislation is absolutely specific, nor does it state how protection is to be achieved.

However, the Computer Abuse Act was recently introduced by Emma Nicholson, who, before becoming a Member of Parliament, had a career within the computing industry. The Computer Abuse Act is the first UK computer security legislation specifically designed to combat hackers and the abusers of computer systems. It is now planned to introduce a more comprehensive computer usage bill which would include disaster recovery provisions. Users would be required to comply, by law, with certain minimum standards for maintenance, support and upgrades. The move is supported by the Computer Services Association, which has been pressing for European laws to reflect current United States legislation.

A further spur to this will be the requirement for European companies to communicate more effectively with the introduction of the European Single Market in 1992. Comprehensive disaster recovery plans and facilities will have to extend throughout these organizations and throughout the continent. The way should be paved for collaborative efforts between countries to ensure that mutual systems are effectively protected against disruption.

At Alkemi, in addition to our consulting and training activities, we have also attempted to address the awareness problem. The Survival Pack consists of two items: a book, “The Survivors Guide to Standby Services”; and a video, “The Survival Game.” The book, as its title suggests, provides those responsible for this area with a guide to the current market offerings and also provides an insight into the need for proper contingency planning.

The video is used as an awareness tool to ensure the essential support and proper funding by senior management. The Survival Game graphically illustrates the need for disaster recovery planning as a business requirement without resorting to lurid accounts and horror stories.

While the Survival Pack is not intended to be the definitive guide to disaster recovery planning (this is covered by our major work, Computer Risk Manager), it does address one of the key areas and should help ensure that the overall project receives the correct level of support to ensure its success.


Steve Watt is a consultant with Alkemi Limited in Berkshire.

This article adapted from Vol. 4 No. 2, p. 42.

At midnight, December 31, 1992, the entire economic structure of Europe will change dramatically. At that time the European Community (EC) will unite into a single common market. This unprecedented act where currency, information, goods, people, and services may move freely among the 12 EC member countries will alter the way U.S. companies (with foreign interests) plan for disaster recovery. The EC member countries include the following:

  • Belgium
  • Denmark
  • France
  • Greece
  • Ireland
  • Italy
  • Luxembourg
  • The Netherlands
  • Portugal
  • Spain
  • The United Kingdom
  • West Germany

Since the Single European Act was adopted in 1987, many U.S. companies have already begun moving their information processing activities closer to their foreign customers. Notwithstanding the most obvious reasons for moving their IS operations, e.g., cost, U.S. companies have one additional compelling reason, a need for uninterrupted business operations. Additionally, attempting to support subsidiaries in Europe from the U.S. may require very expensive satellite communications.

Companies moving their IS operations will face many challenges: lack of systems standards, different hardware and software support, telecommunication networks that predate World War II, and a lack of standard network protocols.

The need for disaster recovery has long been recognized in Europe, especially in the United Kingdom and West Germany. In fact, both these countries have had commercial disaster recovery centers available since the late 1970s. Not only are Europeans less forgiving of errors and more demanding of quality and reliability, but the growing number of disasters throughout the continent is alarming. This article examines the types of disasters that have affected international data centers in the hope that it will alert U.S. companies moving their IS operations overseas to the risks involved. Although many of the risks are similar to those in the U.S., others such as terrorism are not.

Even though the sample size is relatively small (statistically speaking), we believe that the data supports our position that risks abound abroad as well as in the U.S.

ANALYSIS OF DISASTERS BY TYPE

FIRE
Fire shares the spotlight with terrorism as the single largest source (17.5%) of disasters internationally. This compares to 13.2% in the U.S. Of all the case files, those where fire was the cause showed the most extensive damage. In virtually all instances, the data center and the building itself were totally lost. In one particular case, The National Bank of Australia lost an $8 million IBM 3090 processor just weeks after it was installed.

TERRORISM
Terrorism is one risk event that most U.S. IS managers have not had to deal with. In fact, terrorism accounts for only 3.4% of computer-related disasters in the U.S. Unfortunately, terrorism internationally has accounted for as many incidents as fire. Moreover, the dollar loss resulting from acts of terrorism is far in excess of that from any other type of disaster incident. A favorite target of the terrorists has been the service industry, in particular computer companies. However, terrorists have not limited their activities to this industry sector, as evidenced by the bombing of Interpol Headquarters in London this past year.

HURRICANE/TORNADO
The fact that hurricanes and tornadoes rank as high (14.0%) as they do is a bit of a statistical anomaly. If the Caribbean countries were left out of the international disaster score board, the percentage would be low. All the case files for hurricanes and tornadoes have originated in the Caribbean countries and Canada. In all cases the damage to the actual facility has been light; however, the resulting loss of power caused extensive outages. Although hurricanes and tornadoes can strike the continent of Europe, they are rare. The greatest risk for these types of storms is in Australia, the Caribbean countries, Japan, and Mexico.

EARTHQUAKE
Clearly the worst place to locate a data center internationally in terms of earthquake risk would be Mexico or the Philippines. Not only has substantial damage been caused to IS equipment itself, but IS staffs have experienced heavy loss of life and injury. In these countries, where trained personnel are at a premium, the loss of critical IS personnel is more devastating than the loss of equipment or vital records. In one particular case, a seemingly mild earthquake in Mexico City caused the loss of six senior IS personnel at a bank. Earthquakes account for a smaller share of data center disasters internationally than in the U.S. (10.0% internationally versus 12.8% in the U.S.). However, the actual occurrence of earthquakes is much higher internationally. The disparity is due to the heavy concentration of data centers near major fault lines in the U.S.

POWER OUTAGE
Surprisingly, power outages account for only 9.5% of international disasters versus 15.1% in the U.S. This is primarily due to the wide acceptance of uninterruptible power systems (UPS) internationally. Reliable power has been more of a problem internationally than in the U.S.; consequently, companies have done much to insulate themselves from power problems.

SOFTWARE ERROR
Software errors resulting in extended down time are more pervasive internationally, where the occurrence is 8.8% versus 3.3% in the U.S. This is predominantly because there are fewer technically oriented IS personnel available internationally. In the case files we reviewed, the initial software error was relatively minor; however, a series of failed recovery efforts made matters progressively worse. An argument could be made that the disaster event was caused by human error and not a software error; however, CPR classifies disasters by the event that triggered the disaster.

FLOOD
Canadian and Australian data centers suffer more from flooding conditions than those anywhere else. The incidence of flood-related disasters internationally is virtually the same as in the U.S. (7.0%). In most cases, the flooding could have been avoided by proper drainage. CPR research suggests a universal problem with municipal drainage systems in those countries.

HARDWARE ERROR
Hardware-error-related events account for approximately the same percentage of disasters in the U.S. and abroad. However, their causes are very different. Internationally, many of the computer systems in use are either discontinued models from OEM vendors or very old. This is the primary reason for the occurrence of hardware errors. In the U.S. case files, hardware errors have almost the opposite cause: most result from leading-edge technology where the user was one of the first to install a particular piece of equipment. This was the case with a major U.S. company that lost 165 IBM 3380 HDAs over the course of two weeks.

BURST PIPES
Burst pipes accounted for 3.5% of the data center outages internationally. In most cases, however, the disasters were avoidable. Case in point: during an extremely cold winter, an IS manager decided to cut off the heat to a lights-out operations center to save on electricity. He forgot about the sprinkler system, which subsequently burst and flooded the entire data center.

NETWORK OUTAGE
As in the U.S., network outages account for a fairly low percentage of data center disasters (3.5%). This low number is primarily due to the difficulty in uncovering network-related outages and the short duration of these types of outages. Internationally, however, CPR sees a trend where the incidence of network outages may increase as the EC comes closer. The significant amount of new construction and the wide deployment of networks could make network outages more common throughout Europe.

CONCLUSION

Analyzing the causes of disasters internationally should give IS managers a sense of the trouble areas to avoid and help them realize that disasters are not confined to the U.S.

Both domestic and international users and IS management should be involved in the preparation and implementation of effective business recovery plans.


Tari Schreider is the President of Contingency Planning Research, Inc., a four-year-old disaster recovery consultancy. He has responsibilities for contingency planning and disaster recovery strategies, new technology development, and risk analysis.

This article adapted from Vol. 3 No. 2, p. 14.

October 30, 2007

The Computer Recovery Facility

A profile of one company’s debut and involvement in the disaster recovery industry in Malaysia

Mr. Michael Tong, a naturalized Canadian born in Klang, Malaysia, realized the opportunity and need for Recovery Planning Services in Malaysia. The 42-year-old business technology entrepreneur saw this void in his native country while residing in Canada. In 1986, Mr. Tong incorporated a company called STT Canada. With papers he had written and the assistance of the Canadian International Development Agency (CIDA), a division of the Canadian foreign affairs department, he returned to Malaysia to generate interest within the country in the formation of a full service disaster recovery capability. His efforts resulted in the incorporation of ST & Telecommunications Industries Sdn Bhd, a Malaysian company majority owned by Malaysian investors, the largest being Lembaga Urusan & Tabung Haji (LUTH) at 51%, and in the creation of one of the finest and most comprehensive recovery service entities established anywhere in the world. Thus the Computer Recovery Facility (CRF) was born; it became operational in January 1991.

Attending the official opening of the Computer Recovery Facility were representatives of the Malaysian Government, the Canadian Government, the investors, invited officials from Brunei Darussalam, Thailand, and Saudi Arabia, as well as 300 guests from the business community in and around Kuala Lumpur.

The CRF is the country’s most comprehensive supplier of Disaster Recovery Services. It is a full service disaster recovery facility, offering offsite storage services (OSS), computer recovery services (CRS) and disaster recovery services consulting, to meet the recovery needs of the information technology industry in Malaysia.

The 60,000 square foot facility is a stand-alone building with a single occupant and purpose--dedication to disaster recovery. It is situated close to Subang International Airport, 20 km from Kuala Lumpur city center and 15 km from any major industrial area.

The building includes fire protection and suppression systems, dust filters and vacuum systems, a “clean” room for media storage, closed circuit television surveillance cameras, alarm systems and an online key card access system. The building also sits on a tract of land surrounded by walls and fencing and is patrolled by the facility’s own security guards.

Offsite Storage of Magnetic Media

The Computer Recovery Facility is unique in the recovery center industry in that it provides for storage of magnetic media on a commercial basis. The facility has five storage vaults (clean rooms) which provide for secure and environmentally safe storage. Each vault is physically separate from the others and individually secured with its own level of security. Each vault is independently monitored for temperature and humidity fluctuations and has its own independent and redundant air-conditioning system. At present capacity, the vaults will house 500,000 reels of tapes or cartridges.

The CRF has its own fleet of vans for 24 hour pickup and delivery service. The vans are equipped with racking for tapes and cartridge cases, are air-conditioned and are staffed with two subscriber services representatives. The loading dock and receiving areas are equipped with air curtains and air showers to control temperatures and to maintain a dust free, clean environment. Tape management within the facility is controlled by a customized application using a bar coding system. The application is PC-based and operates on a Local Area Network. It utilizes disk mirroring and is backed up on a second server. The vaults and the application are also protected from power failures or fluctuations by a UPS system and a diesel generator set.

Computer Recovery Services (CRS)

The CRS Division currently houses an IBM 4381 92E mainframe hot-site with associated peripherals. The hot-site is 4,000 square feet, out of a total raised floor area of 20,000 square feet. The hot-site and cold-site are individually controlled computer rooms with redundant air-conditioning, protected by a UPS system and diesel generators. Each area has its own level of security.

The subscribers are supported by a staff of technicians on site for assistance in software, hardware, telecommunications and environmental problems.

The CRS also has a customer recovery control room equipped with terminals, PCs, a telephone system, and fax and copying conveniences. There are also kitchen, eating, and lounge areas, all conveniently located near the hot-site.

The recovery facility also includes the availability of cold-site (shell) recovery services. There is approximately 8,000 square feet of conditioned computer room space available for clients who wish to contract for this space, independently of the hot-site facility. Cold-site space is included in the hot-site service.

Plans are underway to install a second hot-site for the provision of services to Hewlett-Packard users. The CRF and HP recently signed an agreement which will assure HP users of comprehensive disaster recovery services, utilizing the expertise of both companies in the areas in which each specializes. The range of service is described in the following diagram.

Communications Recovery

To support subscribers in their recovery of computer processing in either the hot-site or cold-site facility, the CRF offers elaborate communications recovery support. At the present time there are 800 pairs of cable pre-wired into the building. The pre-planning of the cabling into the CRF included routing half of the cabling into the building by a different route, allowing for diversity and communication backup in the event of faulty or cut cable as a result of construction or other interruptions in service. Plans for future expansion of our communications service include satellite transmission to effect telecommunication links with remote sites and linking up with international networks.

The recovery facility also plans to establish remote communication centers, when warranted, in other cities in Malaysia and South East Asia. These centers will include remote consoles, printers, and communications linked directly to the hot-site, allowing subscribers access to the CRF from their local locations. The remote locations will include facilities to house the subscriber’s staff.

Consulting and Subscriber Services

The CRF also offers subscribers consulting services. The consultants employed by the CRF are experienced staff with many years of experience planning, building and maintaining disaster recovery plans in the North American market place.

The services provided by the consultants to the subscribers include education, training, security reviews, developing alternative strategies, business impact analysis, and file backup and offsite storage reviews. A significant offering of this division of the CRF is conducting disaster recovery awareness seminars. The CRF recently held six such half-day seminars, which were filled to capacity and well-received. The seminars are also offered to subscribers on an individual company basis and can be tailored to specifications.

Subscriber services is a department in the sales and marketing organization. The responsibilities of this group include scheduling client tests, administrative support, tours, orientation, briefings to subscribers, and maintaining and issuing user manuals for both hot-site and offsite subscribers. The group is the central focal point in account management and the buffer for all communication to and from the Computer Recovery Facility. Plans for subscriber services also include the formation of users groups and the development of a disaster recovery newsletter in the very near future.

To support the clients in the development of disaster recovery plans, the CRF recently signed an agreement with ChiCor for the rights to use two of their PC-based recovery software packages: Disaster and TRPS. Disaster and TRPS are in use in 47 countries around the world, with over 1,000 clients, 40% of which are Fortune 500 companies.

With the addition of these two comprehensive packages and the assistance of the consulting staff, the timeframe for the development and preparation of plans will be shortened significantly. In addition, TRPS can manage multiple sites and company business contingency plans.

Summary

Awareness of the need for recovery services in Malaysia is growing quickly, and such services are becoming a mandatory requirement as government guidelines and IT organizations focus on the vital aspects of business contingency and data center availability. The Computer Recovery Facility saw the void in the marketplace and has put together a facility and an organization to support the business community with a comprehensive service offering and leadership unequalled in this part of the world.


Doug Allan is the Center Manager at the Computer Recovery Facility in Malaysia.

This article adapted from Vol. 4 No. 2, p. 46.