Putting the Plan in Motion: A Checklist
By Frank Donaldson
Your organization's contingency plan documents have been assembled and distributed. The important parts of the business have been identified and contracts established for alternate sites or services. Now, if some natural or man-made event interrupts business, how do we ensure all this work will be used correctly? In a major disaster such as a storm, flood, or fire, the event itself dictates that the planned alternative arrangements be implemented.
However, the more likely occurrence is a less-than-total resource loss, such as failure of a critical computer or support equipment, loss of telephone service, or a small fire disrupting one department. In these situations, some structure is needed to determine whether any of the pre-arranged alternatives are needed and, if so, to what extent. This process has been called damage assessment, but it should be more. In the execution of a contingency plan, the assessment should be the transition between the end of the emergency (people safe, assets secured, etc.) and all actions to be taken next. The main goal of the assessment meeting should be to make sure that activities essential to the business continue in some form and that the appropriate actions are taken to ensure this.
Regardless of the cause, any loss of facility, failure of computers, or unavailability of key personnel which causes significant business interruption should be considered an emergency. After execution of necessary emergency procedures, an assessment must be made to determine the extent of damage, loss, or unavailability. With a plan in place and contingencies established, a decision must also be made, based on the scope of the event, whether or not to use these back up resources. If the loss is temporary, the organization may elect to conduct business differently until the necessary personnel, capability, or system is again available. The company may also recover (repair, replace, restore) the resource, again without resorting to established back up arrangements. In both these cases interim actions are still necessary to address customer needs and disrupted work.
Because most back up arrangements take time to implement, this transition decision is even more important. An escalation timetable may need to be prepared, or a down time limit established, identifying when alternatives must be initiated. If the decision is made to activate back up services or facilities, the information necessary to mobilize these resources should already be contained in the plan material. Many incidents have been documented where initial estimates of the time to repair systems, replace equipment, or restore power were inaccurate. Situations like this require that the assessment be repeated and interim actions re-evaluated. This is where a pre-formatted assessment checklist can be valuable. We humans are very resourceful, but in crisis situations important steps to resume critical business functions can sometimes be overlooked.
Migration back to Normal Operations
While back up systems or facilities are in use, recovery must be taking place simultaneously. Recovery strategies involve getting what was damaged repaired and what was lost replaced or restored. These activities must occur regardless of whether or not back up arrangements are activated. In preparation for migration back to normal operations, we should review many of the same decision areas which initiated the transition to back up and recovery activities. A similar checklist could be used to ensure all affected resources and activities are taken into account. Although more time may be available for these migration decisions, they are no less complicated. A PLAN EXECUTION chart has been included to help visualize these two critical decision periods.
An important factor in making this activity happen is senior management's expectation that an assessment will occur every time there is a significant interruption of a critical business function. It is also important that the assignment of responsibility to conduct emergency assessment meetings be clear; without this, responses to many incidents will be ad hoc.
The following is a suggested five step assessment checklist:
Step 1. Determine who needs to be present at the assessment. If only a single location or administrative area is affected, consider those responsible for this area as well as others from related or dependent departments. If an entire location or the organization is affected, Crisis Management Team representatives probably need to be assembled along with the leaders of other response groups who may be called to action. Resource-related technical or special function personnel should be included based on the type of loss or damage.
Step 2. Determine where the assessment meeting will be held. If the interrupting event does not require facility evacuation, then a location inside the same building should be designated. For situations involving entire facilities, a command center near the affected area should be identified if one has not been pre-established.
Step 3. Determine the SCOPE of the event.
A. Identify and list the affected administrative areas.
B. Determine who the affected customers are. Are they administrative areas or locations inside the organization, specific external customer groups, or a combination of both?
C. Determine if providers or vendors are affected. Identify each provider.
D. Identify and list all affected business functions.
E. Determine what resources are affected. Disruption of a department or facility will involve several supporting resource areas; consider each area separately.
The Event has caused loss or unavailability of:
FACILITIES: Describe facility loss or damage: Entire Building, Partial Structure, Damage, Loss, Unavailability.
PERSONNEL: Identify lost or unavailable personnel.
EQUIPMENT: Identify equipment items.
AUTOMATED SYSTEMS: Identify systems: Hardware Configurations, Software, Data, Output Generation, Input Capability.
COMMUNICATIONS: Identify communications equipment or services: Circuits, Voice Equipment/Services, Data Comm Equipment.
PROVIDERS: Describe unavailable services or product items.
SUPPORT SERVICES AND MATERIALS: Identify affected forms/supplies or transportation/distribution services.
F. Determine the extent of the interruption:
MINOR: temporary resource loss (minutes to one business day).
INTERMEDIATE: extended resource loss (more than one business day).
MAJOR: multiple/critical resource loss (several days to weeks).
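The three extent categories in Step 3F can be written down as a simple rule so that every assessment meeting applies them the same way. The following is only a minimal Python sketch, not part of the original checklist. The function name, its parameters, and in particular the reading of "several days" as three or more business days are assumptions made for illustration; each organization would set its own thresholds.

```python
from enum import Enum


class Extent(Enum):
    MINOR = "minor"
    INTERMEDIATE = "intermediate"
    MAJOR = "major"


# Assumption: "several days" is read here as three or more business days.
MAJOR_THRESHOLD_DAYS = 3.0


def classify_extent(outage_business_days: float,
                    multiple_or_critical: bool) -> Extent:
    """Map an estimated outage onto the Step 3F categories.

    outage_business_days: current estimate of how long the resource
        will be lost or unavailable, in business days.
    multiple_or_critical: True if several resources, or a critical
        resource, are affected.
    """
    if multiple_or_critical and outage_business_days >= MAJOR_THRESHOLD_DAYS:
        return Extent.MAJOR
    if outage_business_days > 1.0:
        return Extent.INTERMEDIATE
    return Extent.MINOR
```

Because initial repair estimates are often wrong, the function would be called again with the revised estimate each time the assessment is repeated.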
Step 4. Determine the appropriate response actions to be taken.
A. Decide if pre-arranged back up alternatives will be used.
B. If planned alternate operations will be used, decide what to activate based on those needed in place of resources unavailable.
C. Decide when alternate operations activities must begin. Establish a down time limit (minutes to hours) for interrupted functions. Determine what period of time the affected activities can be interrupted before alternatives must be initiated. Establish an escalation timetable with response actions keyed to Event Plus (hours) time blocks.
D. Decide on a Business Response Strategy.
DEGRADE OR REDUCE SERVICE RESPONSE: If capability or capacity has been reduced, activities take longer and fewer customers are served. Certain activities that are normally part of the affected function may not be done.
SUSPEND OPERATIONS TEMPORARILY: Cease operation of specific functions until alternative resources can be established or affected resources recovered.
WITHDRAW SELECTED OR ALL AFFECTED SERVICES: Refer customers to an alternate source of service or advise them when service will again be available.
PERMANENTLY STOP OR CLOSE OPERATIONS: A major replacement or reconstruction of facilities, equipment or materials is necessary. Provide information on replacement services to all groups affected.
E. Determine available operations alternatives.
Identify what actions will be taken to restore or maintain capability:
Use equivalent alternate resources available within the organization by emergency arrangement or prior agreement.
Obtain or use alternate resources from outside the organization by prior or emergency agreement with specific providers.
Complete the affected function(s) without lost or unavailable resources by utilizing different operational methods or manual procedures.
F. Identify available resources. Review pre-arranged resource alternatives. Determine minimum resources required to re-establish critical business functions.
G. Decide on actions required to continue operation of affected priority functions. Decide what current activities will not be done or the percentage of work which will not be completed. Decide what customers and providers should be notified. Determine how they will be contacted, by whom, and what they will be told.
Decide what arrangements are necessary to meet deadlines and what will be done if deadlines cannot be met. Decide what interim activities must be carried out. How will work in process be recovered and completed?
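The escalation timetable described in Step 4C, with response actions keyed to Event Plus (hours) time blocks, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the sample actions, and the hour offsets are hypothetical and would come from the organization's own plan.

```python
from datetime import datetime, timedelta


def build_escalation_timetable(event_time, steps):
    """Turn (hours_after_event, action) pairs into dated entries.

    Returns a list of (due_time, label, action) tuples ordered by due time,
    where label is the Event Plus block, e.g. "E+4h".
    """
    ordered = sorted(steps, key=lambda step: step[0])
    return [(event_time + timedelta(hours=hours), f"E+{hours}h", action)
            for hours, action in ordered]


def actions_due(timetable, now):
    """List the actions whose Event Plus time has been reached."""
    return [action for due, _label, action in timetable if due <= now]


# Hypothetical example: event at 09:00, three response actions.
event = datetime(1999, 1, 4, 9, 0)
steps = [
    (4, "Notify alternate site provider"),
    (1, "Hold assessment meeting"),
    (8, "Activate back up operations"),
]
timetable = build_escalation_timetable(event, steps)
due_now = actions_due(timetable, event + timedelta(hours=5))
```

Five hours after the event, `due_now` holds the assessment meeting and the provider notification; activation of back up operations is not yet due. If the down time limit is revised at a later assessment, the timetable is simply rebuilt with the new offsets.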
Step 5. Summarize response decisions and strategies.
A. Outline interim actions to be taken while alternate operations are being established or direct recovery of resources completed.
B. Summarize Business Response Strategy. Detail how loss of capability or capacity will be presented to customers of affected critical functions. Options: Degrade, Suspend, Withdraw, Stop Operation.
C. Summarize Operations Response Strategy. Select a short term strategy which can be implemented quickly but is usable for a short period or until long term strategy is operational.
Select a long term strategy which takes a longer lead time to implement, but is sustainable for extended periods.
If alternate resources are necessary, execute a Back Up Operations plan. Identify and activate the use of pre-arranged alternative services, facilities, or materials through authorized personnel.
If lost resources need to be repaired or replaced, execute Recovery plans. Outline necessary reconstruction or replacement activities.
If planned alternative resources are not necessary, execute Direct Recovery plans. Outline interim activities to be conducted while recovery is completed.
These steps are not new information to contingency planners. However, getting a key manager or officer to conduct a thorough assessment of a business interruption may mean quicker and more effective use of back up resources. Consider including a variation of this checklist in your plan document. It will also provide a good record of event response actions and assignments.
Frank Donaldson is a management consultant with Donaldson Resources.
This article is adapted from Vol. 5 #1.
Disaster Recovery World© 1999, and Disaster Recovery Journal© 1999, are copyrighted by Systems Support, Inc. All rights reserved. Reproduction in whole or part is prohibited without the express written permission from Systems Support, Inc.