A disaster plan is a design, or blueprint, for action in the face of adversity. To be effective, the plan must be well thought out, taking into consideration the many complex factors that comprise a disaster.
In part 1 of this two-part series, we suggested a rigorous, yet simple to apply, method for the analysis of disaster based on flow charts. By flow charting the course of disaster we gain a detailed understanding that forms the basis of a systematic plan.
In this article, we discuss in more detail how a scenario-based risk analysis using flow chart techniques can be incorporated into the planning process. Also discussed is the critical phase of plan testing. Testing serves as more than just a means to make sure that our plans are working as intended. It can give valuable insight into the process that can be 'fed back' into the plan to improve performance. Iterative construction of systematic disaster plans - through a cycle of development and testing - adds to the assurance that the plan will work when we really need it.
How to Plan
A huge volume of articles, books and seminars currently exists on the process of planning for disaster. These provide a wealth of knowledge on the planning process. Experts in specialized areas such as off-site computer system backup, restoration of fire damage, public relations, access to replacement machinery and equipment, and many others crucial to the recovery effort provide information on coping with the effects of disaster in the most efficient manner. Many case studies of actual disasters are also available that detail how responses were handled, and perhaps more importantly, how they might have been handled better. In this way, we learn from past disasters how to better cope with future ones.
Perhaps the greatest aid to disaster planners is the computerization of the process using a variety of software tools. These tools range from word-processing based programs that help us format, maintain and distribute our plans to true 'expert systems' that integrate the knowledge of disaster planning experts in an effort to provide guidance to beginning planners.
Disaster planning can get complicated. The speed, accuracy and memory capacities of modern electronic computers greatly reduce the attendant complications of planning. Most disaster planning involved is provided by the flow chart analysis.
After the need for action has been determined, disaster recovery planners as well as those in charge of the operation of our tanker fleet can be made more directly involved. Here are the 'paths' along our flow chart that we are concerned about. Now, how do we handle them? To help answer this question, we incorporate all the specialized knowledge of disaster recovery planning that exists.
We also identify any blind spots for which information might not exist, and attempt to create it from scratch. These pioneering efforts will, in turn, help those that may face similar scenarios in the future. Implementation of the focused plan is now accomplished using a variety of aids, including computer programs that let us capsulate the plan and make it readily available for when the need arises.
Once planning methodologies have been mapped to potential disasters, there remains the not-so-trivial aspect of linking planning to the resources available to our specific organization. In this final phase of the systematic disaster planning process, we identify who will be responsible for the disaster recovery process at each stage outlined in the plan. The 'who' of disaster recovery planning includes disaster planners proper, organizational resources including the department or departments affected, and outside service providers.
Lining up the proper outside resources is critical. Disaster planners soon find that the organization cannot do everything by itself, especially when potentially crippling disasters strike. Restoration companies, equipment vendors, alternate site providers, and others make up an essential part of the disaster recovery process specific to the organization. Whenever possible, these outside resources should be privy to the planning process (or at least the portion that involves them), so that they may offer constructive input based on their knowledge and experience.
The importance of teamwork among internal resources goes without saying. Assurances of commitment and competence in the area of assignment should be a part of the systematic disaster recovery plan. Perhaps above all is the requirement of senior management commitment to the process. As this management is ultimately responsible (at least ethically, and often legally) for effective disaster recovery, this commitment should be readily forthcoming in most organizations. The next line of authority falls to the disaster planning organization.
This group, which often consists entirely of operating personnel, is responsible for direct administration of the plan. All members of the disaster recovery team are important, and they should be recognized as such.
Who is responsible for what in the disaster planning organization can be detailed using a variety of techniques within the plan. Responsibility charts and 'call lists' are a part of every comprehensive planning effort. By attaching names to duties, and by obtaining the individuals' commitment to these duties, we literally make the plan come to life.
Computerization becomes indispensable to this part of the process. The volume, complexity and dynamics of the interrelationships mean that we will need to develop efficient planning aids that can quickly respond to changes.
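As a minimal sketch of such a planning aid, the responsibility chart and call list might be kept as simple records that can be re-checked whenever people or duties change. The field names, duties and contact details below are hypothetical placeholders, not part of any actual plan.

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    """One row of a responsibility chart (all fields are illustrative)."""
    duty: str        # the recovery task named in the plan
    person: str      # empty string means nobody has been named yet
    phone: str       # contact number used to build the call list
    committed: bool  # True once the individual has agreed to the duty


def unfilled(assignments):
    """Return duties that have no named person or no commitment yet."""
    return [a.duty for a in assignments if not a.person or not a.committed]


def call_list(assignments):
    """Return (person, phone, duty) for every duty that has a name attached."""
    return [(a.person, a.phone, a.duty) for a in assignments if a.person]


if __name__ == "__main__":
    # Placeholder data for illustration only.
    plan = [
        Assignment("Activate alternate site", "A. Planner", "555-0100", True),
        Assignment("Notify restoration vendor", "", "", False),
    ]
    print("Still open:", unfilled(plan))
    for person, phone, duty in call_list(plan):
        print(f"{person} ({phone}): {duty}")
```

Rerunning a check like `unfilled` after each personnel change is one way to keep names attached to every duty.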
Chart #1 shows how scenario-based risk analysis, planning techniques and organization-specific processes come together to form the core of a strategic disaster recovery plan.
As demonstrated above, all components are essential to the effective operation of the plan. Skimping on any of these components will result in a commensurate degradation of plan performance. Those responsible for the overall planning effort must make sure that all the pieces come together.
An absolutely essential component of the disaster planning process is testing. Testing refers to the exercise of the plan under 'simulated' conditions. In effect, testing allows us to 'try out' our plans before they are actually needed.
The idea is that the worst time to find out your disaster plan is in some way defective is when you are faced with a real disaster. A scenario-based analysis can provide the structure needed for realistic plan tests.
Scenarios can be utilized for plan testing in a variety of ways. Most simply, we could run through the possibilities, i.e., the 'branches' of the flow chart, providing our test participants with a realistic representation of events constituting the disaster scenario under study.
This allows specialized testing while maintaining the simulation framework. Added realism can be introduced by simulating adverse consequences of initiating events based on their actual probability of occurrence.
Obviously the simulation would need to be sped up to compress the very large time frame within which small probability events occur to within a reasonable time period for study. This can easily be done by running many computer generated random numbers, based on the underlying event probabilities, through the flow chart and noting the outcomes.
If the flow chart has been set up using a computer spreadsheet program this task is relatively easy. Most spreadsheet programs have random number generators built in that can be used to emulate a variety of underlying probability distributions. In this way, planners can observe literally thousands of 'years' of experience in a relatively short time frame.
These outcomes would then be used as the cues to trigger the proper response. These random simulations, also known as Monte Carlo simulations, add a heightened sense of reality and excitement to the exercise.
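The same random walk through the flow chart can be sketched outside a spreadsheet. The short Python example below is only an illustration: the three branch probabilities are placeholder assumptions (the figures from part 1 are not reproduced here), and the function names are our own.

```python
import random

# ASSUMED annual branch probabilities, for illustration only.
P_ACCIDENT = 0.10               # chance of any accident in a given year
P_SPILL_GIVEN_ACCIDENT = 0.10   # chance an accident leads to a cargo spill
P_FIRE_GIVEN_SPILL = 0.30       # chance a spill leads to a fire


def simulate_year(rng):
    """Walk one simulated 'year' through the flow chart and return the outcome."""
    if rng.random() >= P_ACCIDENT:
        return "no accident"
    if rng.random() >= P_SPILL_GIVEN_ACCIDENT:
        return "property damage only"
    if rng.random() >= P_FIRE_GIVEN_SPILL:
        return "cargo spill"
    return "cargo spill and fire"


def simulate(years, seed=None):
    """Return the list of outcomes for the requested number of simulated years."""
    rng = random.Random(seed)
    return [simulate_year(rng) for _ in range(years)]


def tally(outcomes):
    """Count how often each outcome occurred."""
    counts = {}
    for outcome in outcomes:
        counts[outcome] = counts.get(outcome, 0) + 1
    return counts


if __name__ == "__main__":
    results = tally(simulate(250, seed=1))
    for outcome, count in sorted(results.items(), key=lambda kv: -kv[1]):
        print(f"{outcome}: {count}")
```

Fixing the seed makes a test run repeatable, so the same sequence of simulated events can be replayed for different groups of participants.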
The accompanying chart shows the outcome of 250 simulated 'years' of operation of a hypothetical transporter of hazardous chemicals. It is based on the probability numbers we developed in the risk analysis given in part 1 of this series. As we might expect, the most common outcome is 'no accident'.
When accidents do happen, most are minor (property damage only). However, the potential for disaster exists. This potential was in fact realized during our simulation, in year 205. There, an accident resulted in a cargo spill and subsequent fire. Damages totaled approximately $1,000,000.
An event chain such as this should trigger an appropriate recovery sequence.
Flow chart analysis can also be used to test plans on a more selective basis. A selective analysis can help identify 'blind spots' in the planning process.
For example, a branch of a flow chart could be chosen (perhaps at random) and the affected departments asked how they would respond. Unsatisfactory responses would indicate the need to bolster disaster planning in that area.
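Choosing a branch at random can be sketched as follows. The flow chart here is a hypothetical nested structure of our own devising, with each event mapped to the events that may follow it. It is a sketch of the idea, not the chart from part 1.

```python
import random

# HYPOTHETICAL flow chart: each event maps to the events that can follow it.
FLOW_CHART = {
    "accident": {
        "no spill": {},
        "cargo spill": {
            "no fire": {},
            "fire": {},
        },
    },
}


def branches(chart, prefix=()):
    """Yield every complete path (branch) through the flow chart."""
    for event, following in chart.items():
        path = prefix + (event,)
        if following:
            yield from branches(following, path)
        else:
            yield path


def pick_branch(chart, seed=None):
    """Choose one branch at random to put to the affected departments."""
    rng = random.Random(seed)
    return rng.choice(list(branches(chart)))


if __name__ == "__main__":
    print(" -> ".join(pick_branch(FLOW_CHART)))
```

Each branch is chosen with equal chance here. Weighting the choice by likelihood or severity would be a reasonable variation.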
The information gained by such a scenario-guided mission is far greater than that obtained by simple questionnaires that ask 'Do you plan for a disaster?', or even 'How do you plan for disaster?'
Plan testing in this fashion also encourages a top-down approach to the management of disaster recovery planning. The outcomes of a scenario-based risk analysis show in detail the potential impacts of untoward events on the organization, and their relative likelihoods.
This information relates directly to the financial and operational management of the organization. These are serious outcomes that will surely pique the interest of senior management.
A natural response to potential calamities is 'What are we going to do about them?' Part of the answer comes from those responsible for the management of safety and the financial effects of such events.
Crucial is the response of those who will manage recovery in the face of disaster. When faced with these tough issues, the disaster planner must be able to reference a systematic plan for disaster recovery.
Mark Jablonowski, CPCU, ARM, is Risk Manager for the Hamilton Standard Division of United Technologies Corporation in Windsor Locks, CT.