Under the umbrella of contingency planning, many terms (business continuity, business resumption, continuity of operations, disaster preparedness, and disaster recovery, to name a few) are used interchangeably. Depending on the school of thought, the steps to create and maintain an effective contingency plan or program may also differ slightly.
There is, however, one step that all agree must be performed. Plans must be exercised. (I prefer the term "exercise" because "test" implies a pass-or-fail mentality.)
As plans must be exercised, exercises must be planned. Even unannounced exercises require some degree of planning.
Prudent exercise planning involves many tasks, including, but not limited to:
• Identifying scope and objectives
• Preparing project plans and other documentation
• Notifying participants
• Obtaining funding (where necessary)
• Documenting lessons learned and listing action items for improvements where needed
When planning and conducting exercises, one must not lull oneself and management into a false sense of security by: (1) focusing solely on technological issues and ignoring the people issues; (2) skirting around, band-aiding, or employing quick fixes for real issues for the sake of the success of the exercise; and (3) failing to follow up on action items identified in the lessons learned process to ensure real recovery capability.
Exercises should be rehearsals for reality. Reality will, unfortunately, expose all weaknesses or shortcomings in one’s plan(s) at a time when shortcomings cannot be tolerated.
It is much better to identify and address areas for improvement in a non-threatening, controlled situation than to sustain critical losses in a real situation because certain details were overlooked, ignored, or solved with a quick fix during exercise(s).
It is critical that exercises, even those with the narrowest of scopes, mimic reality as much as possible. One, therefore, must not plan and execute exercises simply for exercise’s sake – one must plan exercises to be able to recover.
Don’t Ignore the People Issues
Do not ignore the people issues. This seems fairly simple, doesn’t it? Thankfully, I find myself reading more and more articles nowadays that discuss the human side of disaster recovery/business continuity planning.
Real events have taught us that the most knowledgeable personnel may not always be available, for any number of valid reasons, to participate in the recovery effort. Prudent exercise planning should include rotating personnel for particular platforms so that all platform team members are familiar with the recovery process.
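As a rough illustration of rotating personnel, the sketch below cycles a platform team through a series of exercises so that each member eventually takes the lead. The function name `rotate_leads` and the team member names are hypothetical, chosen only for this example; a real rotation would also need to account for schedules, skills, and backups.

```python
from itertools import cycle


def rotate_leads(team, exercise_count):
    """Assign one recovery lead per exercise, cycling through the team in order."""
    if not team:
        raise ValueError("team must contain at least one member")
    members = cycle(team)
    return [next(members) for _ in range(exercise_count)]


# Hypothetical platform team and a series of five exercises
schedule = rotate_leads(["Avery", "Blake", "Casey"], 5)
for number, lead in enumerate(schedule, start=1):
    print(f"Exercise {number}: {lead} leads the recovery")
```

With three team members, every member has led at least one recovery by the third exercise.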
Table-top exercises can be an excellent tool for employee training: for example, the most knowledgeable staff can be present while new or less knowledgeable staff members work through the table-top exercise. Input from new staff members could also produce new strategies for recovery.
The people processes within the BC/DR process must not be overlooked either. Typically, someone must issue a declaration of disaster to set the BC/DR wheels in motion.
Is there a documented Disaster Declaration Procedure in place? When is the last time that this declaration procedure was rehearsed? If more than one person is authorized, is there a plan of succession? If there is no documented plan in place, who is authorized to declare a disaster on behalf of the organization?
The answers to these questions are essential, particularly if the company relies on third party vendor(s) for its recovery. The people processes must, at some point, become a part of the exercise because they represent reality.
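One way to rehearse the succession question is to walk the documented order against whoever can actually be reached on the day. The sketch below is only an illustration under assumed names: the roles, the `find_declarer` function, and the idea of a simple ordered list are hypothetical, not a prescribed Disaster Declaration Procedure.

```python
def find_declarer(succession, available):
    """Return the first person in the succession order who is reachable, or None."""
    for person in succession:
        if person in available:
            return person
    return None


# Hypothetical succession order and the people reachable during the exercise
succession = ["CIO", "COO", "BC/DR Coordinator"]
reachable = {"COO", "BC/DR Coordinator"}

declarer = find_declarer(succession, reachable)
if declarer is None:
    print("No authorized person is reachable: record this as a lesson learned")
else:
    print(f"{declarer} declares the disaster")
```

A result of `None` during an exercise points to exactly the kind of gap that should be addressed before a real event.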
Avoid the Quick Fixes
In a relocation exercise, personnel and equipment are moved from the primary location to an alternate site, where the team attempts disaster recovery. Issues of network connectivity and equipment compatibility become apparent, particularly if a third-party vendor is involved.
Changes to the primary environment may involve equipment not carried at the vendor’s contingent site. This is particularly true in the case of Windows servers. In a non-virtualized environment, restoration of functions normally hosted on one brand of Windows hardware may not be possible on a different brand of Windows hardware due to issues with the Windows registry.
Make every effort to avoid having participants transport equipment (network routers, servers, and special tapes, for example) specifically for the exercise. Would this equipment be available if the primary site were destroyed or rendered uninhabitable?
Avoid incorporating undocumented procedures or processes during an exercise for the sake of recovering an application/system. Is this something that could work in a real event? Does it comply with company regulations and policies? Are there interdependencies that could be affected if this fix were employed in a real situation?
If the fix works that well, take the time to document what was done and discuss it during the lessons learned phase. Create an action item to find the answers to the questions above and any others that may result from the discussion.
In short, during an exercise, don’t employ a workaround to cover up a shortcoming. The Band-Aid may cover the scar on the arm, but it won’t help, in the long run, if the arm is broken.
Incorporate Lessons Learned
Exercising the body makes one feel better, perform better, and look better. Exercising BC/DR plans should make the organization feel better about itself and its ability to protect the employees and the business.
Exercising BC/DR plans should make those who have been designated as recovery personnel perform better should a real crisis occur, as practice leads to perfection. Exercising BC/DR plans on a periodic basis will certainly make the organization look better to potential customers and clients, for it clearly demonstrates a commitment to reliability and the provision of continuous service.
For me, the major value of an exercise is the identification of the shortcomings – the issues. "What do we need to do to get better?"
Identifying and documenting the lessons learned from an exercise is the key to getting better, but it is only the first step. Without an action plan, the "Lessons Learned" document will be just another report to be filed, and the exercise simply becomes another completed project.
Here are some steps for incorporating lessons learned:
• Review each lesson learned in terms of its value to the established BC/DR process or plan
• Develop action items and assign them to the appropriate individual(s) or group
• Set dates for follow-up
• Obtain management commitment for support and the resources to complete the action items
• Educate through meetings and publications
• Update the existing BC/DR plans to incorporate the changes
• EXERCISE ... not just to exercise, but exercise to recover
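The follow-up steps above can be sketched as a small tracker that flags action items whose follow-up date has passed without completion. This is a minimal sketch, not a recommended tool: the `ActionItem` fields, the `overdue` helper, and the sample items, owners, and dates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    description: str   # what the lesson learned requires
    owner: str         # individual or group assigned to the item
    follow_up: date    # date set for follow-up
    done: bool = False


def overdue(items, today):
    """Return open action items whose follow-up date is before today."""
    return [item for item in items if not item.done and item.follow_up < today]


# Hypothetical action items from a lessons learned review
items = [
    ActionItem("Document the network workaround", "Network team", date(2007, 6, 1)),
    ActionItem("Rehearse the declaration procedure", "BC/DR coordinator", date(2007, 9, 1)),
    ActionItem("Update the server recovery plan", "Server team", date(2007, 5, 15), done=True),
]

for item in overdue(items, today=date(2007, 7, 1)):
    print(f"OVERDUE: {item.description} (owner: {item.owner})")
```

Reviewing such a list with management on a regular basis is one way to keep the "Lessons Learned" document from becoming just another filed report.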
James O. Price Jr., CBCP, is the business continuity/disaster recovery coordinator for the State of Georgia Technology Authority. His responsibilities include contingency planning for the State Data Center, Operations and Business Continuity for GTA. James is an instructor for DRI International. He is also an instructor for the Community Emergency Response Training (CERT) program and a team leader for the Atlanta Red Cross Eastern District Disaster Assistance Team. He is a charter member of the Atlanta Chapter of the Association of Contingency Planners, a member of the Southeastern Continuity Planners Association, a member of the Southeastern Business Recovery Exchange and serves as treasurer of the Fulton County Local Emergency Planning Commission.
"Appeared in DRJ's Summer 2007 Issue"