Is your VCR programmed properly? Not sure? Maybe you couldn’t understand the directions in the manual.
VCR programming directions, camera manuals, tricycle assembly sheets: all are standing jokes in the technical writing community. You can reread these directions endlessly as you:
1. Stare into space and wait for inspiration, or
2. Try holding your tongue differently.
None of these tactics helps when the documentation is not usable. Just because you have spent a lot of money on something does not mean the documentation will help you operate, install, repair or use your purchase.
Documentation is not usable if it does not answer a specific question, if it presents information in a confusing manner, or if the information is incorrect.
The result is that the documentation remains on the shelf and you flounder, using trial and error. Eventually you may get your VCR programmed or the tricycle assembled. But you leave the experience unhappy, disappointed, confused and thinking there must be a better way to do this task.
Now, think in terms of a disaster recovery plan - infinitely more expensive and important than a VCR programming feature. What can be done to help ensure your disaster recovery plan will have usable documentation? There is no single magic bullet. But the following nine steps address some of the most common documentation considerations.
Develop a good disaster recovery plan
You cannot expect documentation to make lucid a chaotic disaster recovery plan. A plan may be complex, but it must follow an internal logic. There must be a rational flow. When a disaster recovery plan is developed you cannot rely on documentation to make up for flawed organization. The best disaster recovery plan documentation is obtained when goals, methods and procedures, and desired results are clearly known.
Be prepared for documentation resource cuts
This is the inevitable money and time chase. When a disaster recovery plan runs into a fiscal or time crunch, resources originally dedicated to documentation are usually used to balance the ledger. If the disaster recovery plan is over budget, documentation funding may be cut. If the disaster recovery plan is behind schedule, the time may be taken from the documentation effort.
This is not a fact to be bemoaned; it is a fact of corporate life. The question is how to deal with it and still develop a usable document.
A way to deal with anticipated cuts is to develop a layered documentation plan. Plan for the documentation to consist of elements A, B, C, D and E. These elements can be, for example, a core documentation set (A), user guides (B), documentation subsets for certain parts of the organization (C, D, E), and so forth.
Give these elements a priority. If time or money is pulled, the least critical documentation elements are dropped to compensate. The goal is to protect the main project with adequate funds and sufficient time. If the entire disaster recovery project goes more smoothly than expected, the bells and whistles can be kept.
The key to this is to know:
- what is essential
- what is needed
- what is really nice and
- what would be handy
Then if needed you can start cutting back, but still end up with something valuable to the organization.
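The layered approach can be sketched in a few lines of Python. This is only an illustration: the element names follow the A through E example above, while the priority numbers, cost figures and the `trim_to_budget` function are hypothetical and would come from your own plan.

```python
from dataclasses import dataclass


@dataclass
class DocElement:
    name: str
    priority: int  # 1 = essential, 2 = needed, 3 = really nice, 4 = handy
    cost: float    # hypothetical cost units (money or writer-days)


def trim_to_budget(elements, budget):
    """Drop the least critical elements until the total cost fits the budget."""
    kept = sorted(elements, key=lambda e: e.priority)
    while kept and sum(e.cost for e in kept) > budget:
        kept.pop()  # the last item is the lowest-priority element remaining
    return kept


# Hypothetical documentation plan with made-up costs.
plan = [
    DocElement("A: core documentation set", 1, 40),
    DocElement("B: user guides", 2, 25),
    DocElement("C: department subset", 3, 15),
    DocElement("D: department subset", 3, 15),
    DocElement("E: department subset", 4, 10),
]

kept_names = [e.name for e in trim_to_budget(plan, budget=80)]
```

With a budget of 80 units, the sketch drops E and then D, leaving the core set, the user guides and one subset. The point is that the cuts are decided by priorities set in advance, not by whoever happens to be in the room when the budget shrinks.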
Know what you want the document to do
A disaster recovery plan can involve many players with many agendas. Training people may see an opportunity to develop a training document. Marketing people may see benefit in using the disaster recovery plan as a sales tool that indicates your organization’s commitment to uninterrupted service. Managers may see it as providing proof of need to superiors for increased funding or staffing.
It is an axiom that a document that tries to do too much does nothing. Yet the research that goes into developing a disaster recovery plan will have many uses. The urge will be strong to have many tangible uses for a document that, one hopes, will never be used for its primary goal — recovering from a disaster.
It is best if the disaster recovery document is used only for disaster recovery or disaster recovery training. Other separate documents should be created for the other stakeholders. In some cases, these subset documents may be literal copies of the disaster recovery document. Usually, however, the subset documents will have to be modified to suit the particular needs of marketing, training or other uses. This means increased cost and more time to produce these subsidiary documents.
All of this should be considered carefully when you start mapping out the disaster recovery documentation. But the goal should be that a given piece of documentation has one objective. If you try to force a document to do too much, you will have a document that confuses more than it clarifies.
Know your audience and write to it
Just as important as knowing the document’s intent is knowing who will read and use it. Know your audience and write to it.
A highly technical organization will have its own jargon and preferred format. If using this jargon conveys the information, use it, even if this is contrary to the general notion that jargon should be avoided. If that is the way your audience communicates, use that language. If your organization is less technically oriented, avoid jargon and take some effort to define terminology as you go along. As a guide, listen to the way members of your audience talk. This helps develop the tone of the document.
Determine access to the document before you write
Determining who will be able to read the disaster recovery document is another way of defining your audience, but it can raise some difficult issues.
If you talk to people involved in the disaster recovery plan, you will probably find them divided between two camps: those who favor as wide a distribution as possible for the documentation and those who favor quite restricted access. Both viewpoints have merit.
Those who favor broad distribution will say that in a time of crisis it will be necessary to marshal as many people as possible to recover operations. The more people who know about the disaster recovery plan, the better. In addition, if the plan is restricted to a few, those people may not be available to help implement the disaster recovery plan, and no one else will know how to access it or understand the information it contains.
Those who favor restricted access argue that a disaster recovery plan contains sensitive, proprietary company information. Security concerns mandate that only a few key individuals have access. Since these key individuals would be spearheading recovery anyway, broader distribution could lead to confusion. People could feel free to make their own interpretations about their role in the recovery.
It helps to resolve this access issue before the documentation is written. Duplication and distribution costs are dramatically affected if you know you need 10, 100, 1,000 or 10,000 copies of a given document.
Document the plan as it exists, not as it will be
Most disaster recovery plans are evolving creatures. Knowing it will be several weeks or months before the plan is published, it is very tempting to look down the road, assume that by the publication date such and such will be in place, and conclude that the document should take this into account now.
The argument for documenting what is still in the planning stage is that it will reduce document update costs and the document will be current when it comes out.
While this sounds good, virtually no equipment, policy or program works as expected. So you end up documenting an assumption as if it were a fact. When this proves to be inaccurate, even in a small detail, the credibility of the entire disaster recovery plan suffers.
Document the plan as it exists today. If you know there will be changes shortly, think about deferring documentation of that specific area until the changes are made and the results known. Or, plan to update that changed section of the plan quickly after the document is published.
Make the document easy to read and as intuitive to use as possible
This is probably the section you thought you would be reading on the topic of usable documentation.
A competent technical writer will know about:
- information mapping methods of organization
- using short sentences in paragraphs to convey information
- using bullet lists for complex ideas
- using graphics to support the written word
- using a large point-size serif typeface for readable body copy
- developing a comprehensive index
These elements affect the appearance and readability of the document. If these elements are not included, the document is simply more difficult to use. If a document is difficult to use, generally it will not be consulted.
No competent document is produced on a nine-pin dot matrix printer any more. Technology has made it easy and relatively inexpensive to produce graphically pleasing documents.
But the most graphically pleasing document will still not be consulted if all effort is poured into design and none into the other issues previously discussed.
Test the document
Software is beta tested before release. Products are prototyped and tested with focus groups before marketing. Yet too often documentation is not properly tested before final production.
Usually, the testing phase is dropped as deadlines press. When this happens, the first issue of the documentation becomes the beta version. Errors or ambiguities will surface and will have to be corrected in a later issue. If at all possible, it is better to assemble focus groups and have the documentation thoroughly examined before final production. Using the documentation as an integral part of a mock disaster recovery is an excellent way to point out weaknesses.
Don’t be disappointed if such testing indicates a need for significant changes. It is much better to find this out during focus group or drill testing than during an actual disaster.
Finally: Keep It Simple, Stupid.
Being able to convey complex ideas simply is the acme of the writer’s art. It is for this ability that they are paid.
KISS has a corollary: When in doubt, leave it out.
The tendency in disaster recovery documentation is to include every possible piece of information which may, in some set of circumstances, prove useful.
While the notion is nice, this leads to documentation measured in feet, not inches, of paper. Be certain that the information placed in the documentation is placed for a reason, not on a whim. It is always easier to add information to a document later than to take it out.
Treat words as the expensive commodity they are. The size of a document is no indication of its value. The larger the document, the less usable it becomes, since people may become intimidated by it. They feel they cannot find the information they want, so they don’t try.
Donald F. Wallbaum is a partner in MillerUpton Wallbaum.
Printed in Winter 1994
Developing a Contingency/Disaster Recovery Plan requires developing questionnaires and conducting interviews with virtually all departments in the organization to gather information. The result of this effort is stacks and stacks of information.
One of the main questions faced by the Contingency Planner is how to sort through all this information to ensure that everything has been addressed, and how to organize the information into a workable plan.
Experience has shown that an effective plan must contain all the necessary information to activate the Plan, assess damage, implement a predefined recovery strategy, monitor the recovery process, and restore business as usual. That seems very straightforward, and it should be possible with all the accumulated stacks and stacks of information. Just put it into binders and distribute it to the Recovery Team members.
The problem is that the Plan must also be easy to read and understandable at all levels of the organization. Management, the Recovery Team members and employees all need to understand the Plan.
Everyone should know the need for recovery planning, the recovery strategy, what needs to be recovered and in what time frame.
In addition, they all need to know who is responsible for the various activities, and what resources are needed. However, only specific teams need to know how to carry out the recovery activities assigned to them.
Therefore, the Plan (that is, the formal document that is distributed to Management and the Contingency/Disaster Recovery Organization and used to communicate with all employees) need only contain the following information:
The WHY (need for recovery), the WHAT (critical processes and resource requirements), the WHEN (critical time frame), the WHERE (recovery strategy), and the WHO (recovery team members and support organizations).
Of course the recovery cannot be accomplished without the HOW information. That is, the detailed procedures and information required to carry out the actions identified and assigned to a specific recovery team.
This information should not be in the formal Plan. It is not germane to the Plan document itself, but is essential for carrying out the recovery. Putting all this detailed information into the Plan document makes it confusing, hard to understand and creates a maintenance nightmare.
Each recovery team needs to understand their role in the recovery process. They also need to understand what the roles of the other teams are and how they interact with each other. This provides a cohesive and effective Plan.
The Plan must be easy to maintain. The Contingency Planner is the most likely person to be responsible for maintenance of the Plan. However, they cannot and should not do it alone.
The Contingency Planner oversees maintenance. Each recovery team shares in this responsibility in that they are individually responsible for ensuring that the detailed procedures and information necessary to carry out their respective team actions are in place, and kept current.
The Contingency Planner ensures that this is done by developing and communicating a maintenance program that assigns responsibility and frequency of reviews.
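A maintenance program of this kind can be tracked with something as simple as the sketch below, which flags teams whose procedure review is overdue. The team names, review intervals and dates are hypothetical placeholders, not part of the format described in this article.

```python
from datetime import date, timedelta

# Hypothetical maintenance program:
# team -> (review frequency in days, date of the last completed review)
schedule = {
    "Damage Assessment": (90, date(1995, 1, 10)),
    "Notification": (30, date(1995, 3, 1)),
    "Restoration": (180, date(1994, 12, 1)),
}


def overdue_reviews(schedule, today):
    """Return the teams whose next review date has already passed."""
    return sorted(
        team
        for team, (interval_days, last_review) in schedule.items()
        if last_review + timedelta(days=interval_days) < today
    )


overdue = overdue_reviews(schedule, today=date(1995, 4, 15))
```

Each team still owns its own review; a list like this only lets the Contingency Planner see at a glance who needs a reminder.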
In addition, the Plan must be able to be tested and audited. The Contingency Planner is responsible for ensuring that the Plan is tested and for conducting audits.
The Planning process is never complete. Testing the Plan will more than likely find problems that need to be resolved. This requires changes to the Plan or maintenance. The changes are communicated to the recovery team(s). That’s training. Then the Plan is tested again. That’s more training, possibly uncovering more changes. On the other hand, changes in processes or the business require a review of the Plan, resulting in changes that need to be made, communicated to the teams and then tested. The life cycle of maintenance, training and testing continues.
This being the case, let’s try to make the process as easy as possible without sacrificing the quality of the Plan.
The first step in developing a Contingency/Disaster Recovery Plan is to determine what information should be in the plan, and how this information is going to be organized.
The format presented here provides all of the above and has a proven track record.
Neither the size or complexity of the organization nor the type of business has affected the basic concept. However, it must be noted that the content of each plan is customized for the specific organization, thus making each Plan unique.
The concept of the format is very simple.
The entire plan consists of two distinct but supporting documents.
The first, I call the Plan (that is the document that is distributed to the Contingency/Disaster Recovery Organization).
The Plan contains only the WHY, WHAT, WHEN, WHERE, and WHO. The second I call the Detail Reference Material; it supports the Plan and holds the HOW information.
This information is compiled and maintained at safe and secure offsite storage location(s).
The Plan document consists of five sections as follows:
Section I - Introduction to Disaster Recovery
The intent of this section is to provide the background as to why the recovery plan is needed and to identify the purpose, objective, and scope of the plan.
In addition, this section states any assumptions that must hold for the Plan to work. It should also define a disaster, show the reader how to navigate through the Plan, and describe the contingency/disaster recovery organization and how the teams are composed.
Finally, this section explains how the plan is maintained and tested, and how training is conducted. Plan distribution is also covered here. Average length of this section should be 7 to 10 pages.
Section II - Plan Overview
This section should give the reader a brief idea of the recovery strategy adopted by the organization, a narrative of the criticality of processing, the recovery time frame(s), and a management checklist of major actions that may take place during a disastrous situation, including the restoration activities and the return to normal. Average length of this section should be 8 to 12 pages.
Sections I and II, when complete, constitute a Management Summary of the Plan and can be used to brief management and provide a general overview of the Plan to the recovery teams and all employees.
Section III - Contingency/Disaster Recovery Organization Responsibilities and Activities
This section should describe the disaster recovery teams’ responsibilities and the detailed actions they perform to assess damage, activate recovery procedures, and monitor recovery progress, as well as the actions required for restoration and returning to normal processing at a permanent site.
These actions are one or two line statements with references to other parts of the Plan as appropriate.
Average length of this section should be 15 to 30 pages depending on the size of the organization.
Section IV - Notification Procedures
This section is used to identify all the areas that need to be notified and for what reason. It is extremely important that this section be complete and kept current. It contains names, addresses, phone numbers, and other pertinent information (e.g., site IDs and contract numbers) necessary to activate the Plan and begin assessment and recovery operations. Average size of this section should be 20 to 35 pages depending on the size of the organization.
Section V - Reference Material (appendices)
In Section III all the actions that may need to be taken are identified and assigned to one of the teams, but the detailed procedures on how to carry out each specific action are not included.
This section contains index listings of the information and detailed procedures required to support the actions of the recovery teams. This section contains only the titles and the offsite location where they are stored.
The actual detailed information and procedures for each recovery team would be in separate Contingency/Disaster Recovery Team Books located in safe and secure offsite storage location(s). Average size of this section is 20-30 pages depending on the size of the organization.
Note: This section would not contain the bulk detail of information and procedures. It would, however, provide an audit listing of them, the areas responsible for updating, and the location(s) where they are kept. The offsite information is the detail on HOW to perform the recovery actions.
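As a rough illustration, the Section V index can be thought of as a list of records holding only a title, the area responsible for updating, and the offsite location. The sketch below is one possible way to keep such a listing and to spot incomplete entries during an audit; the record fields mirror the note above, while the sample titles, area names and location labels are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceEntry:
    title: str
    responsible_area: str   # area responsible for keeping the procedure current
    offsite_location: str   # where the detailed HOW material is stored


def audit_listing(entries):
    """Return one line per entry, sorted by title, for the Section V index."""
    return [
        f"{e.title} | {e.responsible_area} | {e.offsite_location}"
        for e in sorted(entries, key=lambda e: e.title)
    ]


def incomplete_entries(entries):
    """Return titles that lack a responsible area or an offsite location."""
    return [
        e.title
        for e in entries
        if not e.responsible_area or not e.offsite_location
    ]


# Hypothetical index entries.
index = [
    ReferenceEntry("Tape restore procedure", "Operations", "Offsite vault 1"),
    ReferenceEntry("Network cutover procedure", "Telecommunications", ""),
]

listing = audit_listing(index)
missing = incomplete_entries(index)
```

Note that the index never holds the procedures themselves, only enough information to find them and to know who keeps them current.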
The Plan format provides WHY you need to recover, WHAT needs to be recovered, WHEN the recovery needs to be accomplished, WHERE the recovery will be done and WHO is involved with the recovery process.
The Plan format also provides a listing of the HOW procedures in an organized manner, but not the detailed instructions.
Those are in the actual Detailed Reference Manuals stored in an offsite location. So all the stacks of information accumulated through the questionnaires and interviews are used, but they are assembled so that the team that needs each piece can use it.
When using this format you will be using references to other Sections and Subsections of the Plan.
Therefore, you should use a good numbering scheme, one that can serve as a standard for all Contingency/Disaster Recovery Plans in your organization, e.g., 1.0, 1.1, 1.2.1, 1.2.1.1, etc.
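If the Plan is maintained electronically, a dotted decimal scheme like this is easy to check and sort mechanically. The following is a minimal sketch, assuming section numbers are written as digits separated by periods; the function names are hypothetical.

```python
import re

# A section number is one or more groups of digits separated by periods.
SECTION_RE = re.compile(r"^\d+(\.\d+)*$")


def parse_section(number):
    """Convert a section number such as '1.2.1' into a tuple of integers."""
    if not SECTION_RE.match(number):
        raise ValueError(f"not a valid section number: {number!r}")
    return tuple(int(part) for part in number.split("."))


def sort_sections(numbers):
    """Sort section numbers in document order (1.2 comes before 1.10)."""
    return sorted(numbers, key=parse_section)


ordered = sort_sections(["1.10", "1.2", "1.2.1", "1.0"])
```

Sorting on the parsed tuples rather than on the raw text keeps 1.10 after 1.2, which a plain alphabetical sort would get wrong, and the validation step catches mistyped cross-references before the Plan is distributed.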
R. J. (Jim) Terry, CDRP, has been in MIS for 29 years. He works for Dyncorp in St. Louis, MO.
Printed in Spring 1995
I. PREPAREDNESS IN THE 90s
Preparedness is the hot topic in the 90s. Nationwide, there is a growing interest in both public policy and the private sector to put into effect the tools needed to assure readiness when disaster strikes.
Public policies are being written to mandate a working plan be in place at all levels of government. Corporations are realizing that a plan must move beyond the traditional approach to center on far more than the data processing areas.
Our reliance on electronics forces us to be realistic about what can go wrong. A major catastrophic event can cause many companies to suffer long-term recovery problems in the absence of a well developed and executed plan.
II. WHAT IS A DISASTER?
I offer a simple definition by Susan Bulgawicz and Charles Nolan. They define a disaster as “an event whose timing is unexpected and whose consequences are seriously destructive or simply an unfortunate event.”
The key elements are:
- Significant destruction
And lastly, a disaster sometimes involves a lack of foresight or planning. Since no one knows when an event will occur, we must be ready.
III. THEREFORE WE PLAN
In the last few years disaster recovery planning, crisis management, contingency planning, risk analysis and business resumption activities have grown steadily and are at an all-time high.
Basically everyone is trying to develop or refine a plan that has been approved by management, implemented, periodically tested, and understood by all the key players involved in the plan.
Once the top priority of safeguarding human life has been achieved, the focus of attention becomes the mitigation of damage to property and the immediate resumption of business operations.
IV. THE ROLE OF THE RESTORATION INDUSTRY
While these plans are complete with many layers of concerns, we can now focus on the role of the restoration industry and how key vendors through pre-planning can become instrumental in assisting you in a rapid return to business as usual.
As an overview, the types of services available are as varied as the types of dilemmas you may face. These range from simple broken pipes to major hi-rise fires to massive earthquakes and major hurricanes.
There are contractors and vendors ready to retrieve vital data from heavily damaged electronic data processing centers. There are hot site vendors, high tech electronic cleaning experts, book and document drying and cleaning experts, and odor control specialists. There is a general contractor that offers “Hyper-Speed Reconstruction” that can rebuild a vital structure in 1/4 to 1/3 the normal time. You must know your needs to know your resources.
V. LOCAL, REGIONAL AND NATIONAL RESOURCES
There is usually a very good group of skilled and capable vendors in your immediate area for most small to medium-sized occurrences. These vendors are very responsive in terms of a cost-effective solution to most problems. Likewise your own in-house facilities personnel may be able to handle most minor events unaided.
Regional contractors can provide the next logical layer of services in larger single events (that is, events that affect only your operations) such as fires or water damage situations.
However, when the event is on a larger scale or regional in nature such as a large flood, an earthquake or hurricane, the local and eventually regional resources will become tapped, committed and unavailable.
Local and regional contractors will become short staffed, over-booked and under equipped to respond to your immediate needs. At this point the large national vendors become the most reliable resource.
Larger situations require experience that local and regional vendors may not have. In short, knowing the capabilities of your resources, knowing when to call and knowing what kind of a response is realistic and available is key to your early recovery.
VI. WHY PRE-QUALIFICATION IS GOOD PLANNING
Pre-qualification is the buzzword of the 90s in the restoration industry. While the concept is not new, the need to be ready has pushed it to the front burner for planners.
There is a growing movement to include key vendors and contractors in the development and testing of a recovery plan.
In a confidential format, vendors and corporations are meeting to stage mock disasters that allow each player to understand the others’ priorities and capabilities.
Agreements are being made to exchange names, phone numbers and key procedures in order to assure a minimum amount of delay and confusion when an event occurs. The reduction in response times allows for a reduction in down-time. These industries are generally not listed in the yellow pages, so doing your homework pays big dividends.
A good example of how pre-planning can work at the time of a major disaster occurred October 17, 1989 in San Francisco, Calif. My home phone was ringing almost immediately. The first three calls were from established relationships needing immediate help. Our first 10 responses were with pre-qualified relationships.
Companies with major problems had a commitment to be served quickly. Consequently they were among the first companies to be on the road to recovery.
VII. ADVANTAGES OF PRE-QUALIFICATION
1. Reduce down-time and gross loss.
2. Speed return to business recovery.
3. Save equipment and information vital to organization.
4. Demonstrate care and concern for employees and customers by pre-planning.
5. Streamline insurance concerns and cost factors by controlling how decisions are made in advance of a confusing situation.
6. Simplify coordination of key players.
VIII. WHAT TO LOOK FOR IN A RESTORATION VENDOR
Now that we have discussed what our goals are and have investigated an overview of what is needed and available, how do you finalize who will be your resource in the event of an emergency?
Let’s look at some important considerations when selecting a restoration company.
1. Commitment to be available 24 hours a day for emergency response. What good is a resource that you cannot get in touch with?
2. Response time - Within hours a representative should be en route or at the site to meet with key people.
3. Capabilities - What resources can they bring to bear, and in what force?
4. Strength (financially), age, and experience in the industry in dealing with similar situations.
5. Track record - References. Research the background of the company. The strongest recommendation is when someone who has been through a situation says they would call the company again and again.
6. Testing through meetings, exchanges of information and mock emergencies. This process allows you to become familiar with a company in a pre-loss environment.
IX. A WORD ABOUT THE INSURANCE INDUSTRY, RESTORATION AND YOUR RECOVERY
Risk managers, controllers, financial officers or the president of your company will contact the agent or broker at the time of a loss, setting into motion a chain of events.
The broker or agent will call a claim center for the carrier of the insurance. The claims center will assign a company adjuster or an independent adjuster to report to the loss site as soon as possible.
Meanwhile at the loss site you should move ahead with documentation and mitigation of damages.
Most insurance policies read “you, the insured, are responsible to mitigate damage pending the arrival of a representative of the insurance company.” When the insurance people arrive, they will work with your staff and your vendors/contractors to develop a scope of services to address the damages.
This scope will take into consideration such things as priority needs, and cost to replace. Other factors to review include the age of the equipment, down time awaiting replacement and availability of replacement.
Other insurance issues to be aware of:
Deductibles - When the deductible is high, it transfers the weight of the decisions onto the insured. When damage is great and beyond the deductible, the insurance industry has a greater role to play in how the restoration and repair concerns are addressed.
Final decisions are the insured’s to make. The insurance companies will guide you, but not make the decisions on your behalf.
In a regional disaster you may be on your own for 72 hours or more. The insurance industry will set up offices to respond as quickly as possible, but you may have to make some important decisions on your own.
In a planning environment we all have the opportunity to research the resources that are available. The efforts that we expend before a loss occurs are always rewarded at the time of a catastrophic event. There is a wealth of information and an army of professionals ready to assist you when you need it most.
Getting ready for a disaster is a job that cannot be put off for any reason.
Jim McGovern is regional manager of marketing and sales for M.F. Bank Restoration Company. He has more than 15 years of experience in both construction and restoration.
This article is adapted from Vol. 4 #4.
The first thing you need prior to development of a detailed Business Resumption plan is to obtain senior management commitment in verbal and financial forms. Without management sign-off, you will be wasting a lot of time documenting a plan that most probably will only be a dust collector (or a paper tiger).
Now that you have senior management convinced that a disaster plan is necessary, what about line managers? Many line and/or function managers are so busy with day-to-day operational activities that Business Resumption Planning is low on their priority list. You must obtain senior management’s support during meetings with line managers to convince them of the importance of Business Resumption planning.
The second item of business would be to select a contingency planning officer with significant company stature. This move would send the proper communication to the remaining line managers that senior management has placed a high level of importance on disaster planning.
Be sure to involve all critical business units in the disaster planning process for your company. Provide the forum for the various business representatives to discuss and document their plans and share them with all the other units. This methodology should produce a better coordinated disaster plan.
All business units should be operating from the same corporate-wide Basic Assumptions to ensure continuity in Business Resumption planning. Plan for the major/worst case disaster.
If you plan for the worst case scenario, anything of a lesser degree should be covered by your DR Plan.
- The Computer Center has been completely destroyed along with all equipment and documentation.
- Backup tapes and documentation are stored off-site.
- Many employees are injured or deceased.
- All data processing support areas have been destroyed.
- Trained employees familiar with the critical business functions will survive the disaster to implement the Business Resumption plan.
- Telecommunications network control has been completely destroyed.
Each business function should develop its own assumptions/constraints to outline the environment it may be operating under (in more detail than the Corporate assumptions/constraints). You should key off the Corporate assumptions/constraints.
Development of a comprehensive outline of assumptions/constraints will not happen overnight. This process evolves over time and should not be cast in concrete. When systems and procedures change, you should review the impact on your Business Resumption plan. The basic assumption/constraint that establishes your primary/critical functions/operations in the correct priority order is extremely important. If you don’t have your critical operations in the right sequence, you could jeopardize the business recovery efforts.
What are the basic objectives of sound Business Resumption planning?
- Reduce to a minimum the probability of interrupting critical/essential services to the customer, and ensure financial stability during the recovery phase of the disaster.
- Provide a real sense of security.
- Reduce risk of delay or inability to operate.
- Ensure the reliability of backup systems.
- Provide a standard for plan testing.
- Minimize decision making time frames during the disaster.
You should identify the major costs associated with the positioning of your company to survive a major disaster and obtain senior management approval for the expenditures. Without the expense commitment up front, your Business Resumption plan may not be worth the paper it’s written on.
Ensure that your critical equipment vendors know what would be expected of them in a major disaster (get it in writing).
Review your critical operations and determine what equipment is essential to performing them in a disaster situation. Will you need more or less equipment, or maybe different types of equipment, under the disaster scenario? Meet with the equipment vendors and discuss your expectations and what they can accomplish in a disaster mode. Have vendors document their deliverables to the organization.
What helps get management’s attention is a real disaster (Hugo, earthquake, etc.) with real consequences and outcomes. Build on these situations.
Don’t just write a Business Resumption plan to satisfy regulatory agencies; do it to improve your company’s chances of survival.
Lessons learned from actual disasters (major or minor) prove invaluable for future development of workable disaster plans. We are becoming more and more aware that disasters can and do occur and we must provide for contingency plans to survive as an entity.
It is not enough for your Data Center alone to have a disaster plan if that plan does not interface with the individual business units of your company. It’s like building the Data Center plan in a vacuum. It may look good, but it won’t work. The most critical phase of disaster survival is the management commitment of human and financial resources to preposition your company to survive.
- INTRODUCTION - What is it you are attempting to accomplish and what are the basic parameters?
- What will be your initial response to the disaster?
- Your Contingency Operations will inform you on how the critical operations of the business group will process during a disaster. It won’t be business as usual.
- The Restoration Operations directs the recovery efforts from disaster assessment through the restoration of your original site.
Note: The team approach will work to your company’s advantage. (Review attached disaster organization chart.) Name your team leaders and individual team members prior to the disaster.
- The Management Support phase provides for the necessary administrative support during a disaster.
- The Maintenance and Testing portions of your plan speak for themselves. If you don’t do maintenance and testing, your plan will most probably not work.
For example, something as simple as an outdated employee emergency contact list will cause serious problems during the initial phase of a disaster.
- The Training of Employees phase may be one of the most important steps you take to survive a disaster. Do it on a corporate level.
- Greater Disaster Prevention efforts will provide some reduction in potential disaster risks.
- How are you going to do your business when your place of business is gone/destroyed?
- Where are you going to relocate your critical business functions?
- How are you going to replace your destroyed records?
- How are you going to replace your destroyed equipment?
- How long will it take for you to relocate your new business?
- Where are you going to obtain employee replacements in a hurry?
- What would be the most devastating time frame for your disaster to occur? Plan on it!!
If you can’t answer most of these questions, you may not survive a major disaster.
Let’s spend more time preventing or reducing the potential of a disaster. "An ounce of prevention is worth a pound of cure."
- Does the building that you occupy have sound and well documented fire/emergency/safety procedures? Are fire drills conducted on a regular basis?
- Does the building have a sprinkler system?
- Are employees familiar with proper procedures to follow in an emergency?
- Do some employees have training in first aid emergency assistance?
- Do your telephones have emergency numbers recorded and readily available in all areas? (Building Security, 911 if applicable)
- Do you have bomb threat procedures?
Procedures will improve the chances of survival.
- Take ownership of development of Business Resumption Planning or it may not work when you need it most.
- Test, test, and retest your Business Resumption Plan until it becomes second nature to your BUSINESS RESUMPTION Team.
- There is not a company in the country whose plan for a major computer and business disaster will provide the same customer service levels that exist under normal conditions. (Data Center/Service Center and major business units)
- How long do you have to start processing your operations? That depends on the type of business you operate. (Banks two days maximum)
Communications can be your number one friend or number one enemy depending on how you use them during a disaster.
What if your telephone system were inoperative or inefficient during the disaster? How would you communicate with the outside world?
Would cellular telephones work? Should they be assigned prior to disaster or after the disaster?
What about radio systems (with your own channels) as an effective method of communications?
How about beepers on key personnel?
Communications, internal and external, are critical to the very survival of your organization.
Produce your internal and external communication contact lists with the most critical contacts first. External contacts include vendors, customers, regulatory agencies, suppliers and others.
With sound disaster pre-planning your communications will be more timely, effective, and efficient during the actual disaster. Be sure that critical fax numbers are known and documented by team members and other key personnel (internal and external).
Test your internal and external contact list on a surprise basis and document the results.
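The contact-list advice above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Contact` fields, the numeric criticality scale (1 = most critical), and every name and phone number below are hypothetical and do not come from any actual plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    name: str          # person or organization (illustrative values only)
    kind: str          # "internal" or "external"
    criticality: int   # 1 = most critical; this scale is an assumption
    phone: str
    fax: str = ""      # empty string means no fax number on file


def call_order(contacts):
    """Return contacts sorted most-critical first, ties broken by name."""
    return sorted(contacts, key=lambda c: (c.criticality, c.name))


def missing_fax(contacts, max_criticality=1):
    """Names of critical contacts that have no fax number documented."""
    return [c.name for c in contacts
            if c.criticality <= max_criticality and not c.fax]


# Hypothetical sample data for the sketch.
contacts = [
    Contact("Equipment vendor", "external", 2, "555-0102", "555-0112"),
    Contact("Crisis Management Team leader", "internal", 1, "555-0100"),
    Contact("Regulatory agency", "external", 3, "555-0103"),
    Contact("Hot-site provider", "external", 1, "555-0101", "555-0111"),
]

for c in call_order(contacts):
    print(c.criticality, c.kind, c.name, c.phone)

print("Critical contacts missing a fax number:", missing_fax(contacts))
```

Keeping the criticality rank in the data, rather than relying on the order in which the list happens to be typed, means the list can be re-sorted whenever priorities change and checked for gaps before a surprise test.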
Crisis Management Team
To provide general leadership and direction during all phases of Disaster Recovery.
To evaluate disaster data received from the restoration Team Leader (Disaster Assessment Team) and decide if a disaster exists within the company. If the Crisis Management Team declares a disaster, the notification process and the Business Resumption plan (portions needed to survive) would be activated.
Disaster Assessment Team
To collect all pertinent information about the disaster and report findings in a timely manner to the Crisis Management Team.
Contingency Operation Team
Various business functions. Also, manage all of the critical operations during a declared disaster.
Management Support Team
To establish the Business Resumption command post and provide administrative support to the Crisis Management Team. Personnel issues (coordinated with Human Resources) are handled by the Management Support Team.
Restoration Operation Team
To identify the general activities required to restore the original business function. It directs the restoration efforts from disaster assessment through the restoration of the original business site.
Alternative Site Team
Responsible for the preparation of the temporary site to continue operations (facility, equipment, furniture, telephones). Coordinates with real estate, General Services, equipment vendors, and Telecommunications personnel.
Responsible for the transition to and from the alternative site.
Disaster Site Team
To monitor the restoration of the disaster site. Activities dealing with facilities and equipment will be handled with applicable vendors.
- Contacting personnel at the onset of a disaster (on the direction of the Management Support Team Leader).
- Maintaining an ongoing status of personnel (dead/injured team member or reserve).
- Maintaining a resource pool of all personnel (i.e., inactive in the recovery, but on call for participation).
- Coordinate all travel/lodging arrangements.
- Provide clerical staff to assist Crisis Management Team.
- Coordinate petty cash issues.
- Mealtime arrangements, if necessary.
- Coordinate payroll issues.
Some Key Points Regarding Business Resumption Organization and Teams
- Assign team leaders and members prior to any actual disaster.
- When possible assign team members by functional position instead of individual names.
- Ensure that all team personnel are familiar with their Business Resumption responsibilities.
- Conduct periodic reviews with team members to update BUSINESS RESUMPTION plans and perform walk throughs.
- Your Disaster organization should closely resemble the business organization that gets the job done on a daily basis.
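The point about assigning team members by functional position rather than by name can be illustrated with a small sketch. The team names follow the article; the position titles, the staff directory, and the helper functions are assumptions made purely for illustration.

```python
# Teams are defined by functional position (assumed titles), not by person.
TEAM_POSITIONS = {
    "Crisis Management Team": ["Contingency Planning Officer", "Operations Manager"],
    "Disaster Assessment Team": ["Facilities Manager"],
}

# The staff directory changes with turnover; the team definitions do not.
# Names are placeholders. "Facilities Manager" is deliberately left vacant.
STAFF_DIRECTORY = {
    "Contingency Planning Officer": "A. Example",
    "Operations Manager": "B. Example",
}


def resolve_team(team, positions=TEAM_POSITIONS, directory=STAFF_DIRECTORY):
    """Map each position on a team to its current holder, or None if vacant."""
    return {pos: directory.get(pos) for pos in positions[team]}


def vacancies(positions=TEAM_POSITIONS, directory=STAFF_DIRECTORY):
    """Return (team, position) pairs that currently have no holder."""
    return [(team, pos)
            for team, plist in positions.items()
            for pos in plist
            if pos not in directory]


print(resolve_team("Crisis Management Team"))
print("Unfilled team positions:", vacancies())
```

With this arrangement, a staff change means updating one directory entry, and a periodic review can list any team position that has no one assigned before a disaster exposes the gap.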
Business Resumption Team Structure
Close coordination between all teams is absolutely necessary for a successful disaster recovery.
Other special type teams may be necessary depending on the type of business that you are developing the disaster plan for. All business units and the Data Center should coordinate their BUSINESS RESUMPTION planning activities to ensure that critical operations are fully covered.
It’s not how much money you spend in development of your Business Resumption plan that counts; the main question is "will it work?" Build a workable plan that can be readily used in a real disaster. If the Data Center Computer Operations does not position itself to survive, the business units will most probably not survive.
Why Have a Business Resumption Plan in Your Organization
- Survive as a business entity.
- Maintain financial stability.
- Survival of various business functions.
- Continuation of critical business operations.
- Minimize the potential for loss given a business interruption of significant magnitude.
- Prevent loss of customer base due to deterioration of service levels.
Did you notice that regulatory compliance was not included in the list of reasons for having a Business Resumption plan? Why? Because all the other reasons are much more important reasons for having a disaster plan that will work!
When all else fails the Business Resumption coordinator mails out his resume and moves on to greener pastures.
A Business Resumption coordinator who has done his job successfully will have little to do during the actual disaster. Review your insurance coverage and determine if you can recover with your current coverage.
Where you used to think of your business resumption plan as a type of insurance coverage, now think of it as a competitive business advantage.
Richard Piellucci is the Service Center Business Resumption Planning Coordinator for First Union National Bank.
All too often, Recovery Planning projects begin with great expectations but end in disillusionment. Cost overruns, missed deadlines, staffing problems, and a plan that is outdated even before the ink is dry, are just a few of the pitfalls that can plague the unwary Recovery Planner. For the benefit of those responsible for managing Recovery Planning projects, this article will first review the various obstacles that may be encountered during a typical project, and then review how the Recovery Planner can overcome these obstacles, and ensure that the project meets expectations.
It should be emphasized, however, that this article is not about Recovery Planning; it is about project management. While the information will be presented within a Recovery Planning context, the problems and solutions to be discussed are not unique to Recovery Planners. Anyone responsible for managing a complex, multidisciplinary project, regardless of its purpose, may face the same problems, and can employ the same solutions.
The fundamental goal of project management is to ensure that the expectations of management and the project's sponsors are met. These expectations are usually quite straightforward. In the case of a Recovery Planning project, management and sponsors will undoubtedly expect three things: they will expect the plan to be completed; they will expect it to be completed on time and within budget; and they will expect it to be workable.
Once upon a time in the very distant past, disaster recovery planning was not something that you worried about, if you thought about it at all or had even heard the terminology. Things were basically taken for granted: someone was looking after 'that', or 'it' would take care of itself. No one lost any sleep over things like loss of communications service or data files, building inaccessibility, and the like. These things just didn't happen. Welcome to the real world. Now, people not only lose sleep but could very well lose their jobs over the very things once taken for granted.
So let's assume that your company has a disaster recovery plan which has been reduced to writing. How then do you ensure that the plan would not gather dust or become hopelessly outdated?
The problem I have found with disaster recovery planning exercises is that there is no repeated emphasis once the initial flurry of activities subsides. After the plan is completed - more often than not - a company's internal operating practices do not reinforce disaster recovery awareness. The challenge, therefore, is how can you ensure that this awareness is maintained? And where should this awareness program begin?
Let's look at where the opportunities to broadcast the company's attitude towards emergency preparedness can be found. Four readily identifiable areas come to mind which offer the greatest visibility for reinforcing the company's disaster recovery planning policy and strategy:
- The receptionist
- Personnel orientation
- Staff reviews
- Departmental annual budgeting
Starting with the receptionist: I have often phoned companies requesting to speak to the person responsible for disaster recovery planning, only to be asked which department would that person be in. More often than not I'm told, - 'Well I'm not sure but let me transfer you to ...so and so... who should be able to help you.' This is usually followed by another transfer, and if I haven't by this time fallen victim to 'accidental disconnection'...(ooops, didn't you just call), I finally speak to someone who understands what I am enquiring about.
Try it out within your own company. Have someone phone anonymously and ask for the disaster recovery planning officer, and see how quickly they get through to the correct person.
Your receptionist should not only know who is responsible for disaster recovery planning, but should have at his/her fingertips the names of the individual(s) to whom all questions relating to disaster recovery planning should be directed. Why keep the listing of disaster recovery leaders, recovery/restoration team members etc., buried inside the disaster recovery planning manual? Include these names and functional responsibilities on the internal telephone directory listing, prominently, and for all to see and know.
And don't stop with the receptionist. Make the company's disaster recovery program a part of the new employee orientation program. To accomplish this, include a separate and noticeable signoff sheet, to be returned by the new employee, which clearly states that they have received and read the plan prepared for the department/area in which they will be working, will participate in recovery planning exercises from time to time, and will continue to have knowledge of and participate in the preparations of the recovery plans for all future departments they may work in within the corporation. This will not only alert the employee to the seriousness of this topic, but will further serve to reinforce the company's commitment to business continuity planning.
The Human Resources department, on the other hand, should not have to keep current copies of every department's plan. It should be able to call the respective department and request a copy of its most current plan (which should be dated) as and when needed. When it is received, it is included in the orientation kit and passed to the new employee.
We all know that some of the things signed at orientation are seldom if ever looked at again, or remembered for that matter. To ensure that this does not happen to the disaster recovery signoff sheet, include disaster recovery as part of the annual employee review process. This could be accomplished by including a section on the employee's evaluation form which asks what activities the employee has engaged in to support the company's disaster recovery effort. This section should not be used as a grading tool; rather, it is there to ensure that the employee maintains an interest in the disaster recovery planning efforts of the company. Nor is the quantity of activity important: it could simply be that the employee participated in an offsite test of his/her department's recovery plan. The point is that this section should not be left blank. The employee should have done something and, more importantly, should know why it was done!
Finally, make managers accountable. Include as part of the annual budgetary review process, a signed statement by each department head that he/she has in place an updated disaster recovery plan, in line with the company's policy, that the staff under his/her direct responsibility are aware of such plans, and are in adherence to the requirements - that is, maintain current software, equipment configuration, supplies and records offsite, etc. - and has conducted at least one general staff session, to discuss the department's disaster recovery plan, or a test of the state of preparedness of the department.
By now you will have concluded that the intent is to ensure that disaster recovery awareness follows an employee throughout the organization. And the only way to ensure this is to include it at the points where it is likely to be most visible to the employee.
Furthermore, if buildings require prominent postings of Fire Warden signs, then why shouldn't companies insist on a similar posting to identify their recovery personnel? Maybe the day when the building's list will be expanded to include a section for a company's personnel is not too far off! Now this does not mean that people will automatically read it, but [a] they will know where it is, [b] if it is posted close to elevators, the chances of it being read while waiting for an elevator are greater, and [c] there is the possibility that it will be read at least once a day.
A note on credibility of listings: Any list which is not current to within one month will lose credibility faster than an elected politician. Therefore assign the task of ensuring the currency of these lists to a specific area (I would start with the mailroom).
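The one-month rule lends itself to a simple automated check. The sketch below is a hedged illustration, not a prescribed tool: it treats "one month" as 31 days, and the list names and dates are invented sample data.

```python
from datetime import date, timedelta

# Assumption: "current to within one month" is treated as 31 days.
MAX_AGE = timedelta(days=31)


def stale_lists(last_updated, today):
    """Return the names of lists whose last update is older than MAX_AGE."""
    return sorted(name for name, updated in last_updated.items()
                  if today - updated > MAX_AGE)


# Hypothetical lists and update dates, for illustration only.
last_updated = {
    "Recovery team roster": date(1992, 1, 10),
    "Vendor contacts": date(1991, 11, 30),
}

print(stale_lists(last_updated, today=date(1992, 2, 1)))
```

Whoever owns the lists could run a check like this on a schedule and chase only the lists that have gone stale, which keeps the upkeep close to automatic.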
This is not to advocate more work for an already lean staff. On the contrary, this should be so automatic that it becomes second nature. Too many plans are left to wither, to the point where, when they are found, they bear little resemblance to what is currently in place. Like any good contract, the things to pay attention to are not the ones which are obvious now, but those which will haunt you in the future.
Again, the important points to remember if you want to maintain disaster recovery/continuation awareness throughout your organization, year round, are:
- Include it as part of the orientation package.
- Make it a part of the overall annual budget process.
- Include a recovery personnel section as part of the internal telephone directory listing.
- Make all department heads responsible for timely notification of changes.
- Ensure that teams and members are prominently posted... not stuck on the cafeteria bulletin board, and hidden below...Looking for one bedroom apartment possible share...
- Include it as a part of the employee annual review process.
And tell the receptionist.
Franz McConney, CDRP, is President of TVI Corp. in Englewood, NJ.
The time has finally arrived for business continuity planning professionals and the industry to abandon several misconceptions about senior management's views and attitudes towards business continuity planning. Such misconceptions include, but are not limited to, the following:
1. Senior management believes that disaster only happens to the 'other guy!'
2. Senior management is not aware of current catastrophic events and their impact on businesses (even though they all have CNN and other national news sources).
3. Senior management is unaware of Murphy's Laws!
4. Senior management is only interested in meeting fiduciary and regulatory requirements!
5. Senior management will not allocate adequate budget for business continuity!
The need for training business continuity teams is well recognized by the business continuity industry. Certification courses, seminars, professional practice standards, and other authoritative bodies explicitly state requirements for training business continuity teams. However, these prescriptions call for training those individuals who have been charged with the development of the plan. Furthermore, the recommended training appears to be restricted to the plan development phase.
The fact of the matter is that business continuity teams need training at four distinct phases of the business continuity plan development process, namely the pre-planning, planning, post-plan development, and pre-exercise phases. On-going training and education is imperative for those individuals in an organization who will be involved not only in the development and implementation of the business continuity plan, but also in exercising, evaluating, maintaining, and executing the plan.
In this paper, we present approaches to provide training for business continuity teams during all of the above four phases of business continuity planning. We provide the desired elements of training as well as the target audience groups for each of these four phases. We also emphasize that many of these training requirements may need to be met on an on-going basis, rather than a one-time effort.
Your organization’s contingency plan documents have been assembled and distributed. The important parts of the business have been identified and contracts established for alternate sites or services. Now, if some natural or man-made event interrupts business, how do we ensure all this work will be used correctly? In major disasters, storm, flood, or fire type events dictate implementation of the planned alternative arrangements.
However, the more likely occurrence is a less than total resource loss such as failure of a critical computer or support equipment, loss of telephone service, or a small fire disrupting one department. In these situations, some structure is needed to determine if any of the pre-arranged alternatives are needed and if so, to what extent. This process has been called “damage assessment,” but it should be more. In the execution of a contingency plan, the assessment should be the transition between the end of the emergency (people safe, assets secured, etc.) and all actions to be taken next. The main goal of the assessment meeting should be to make sure those activities which are essential to the business are continued in some form and the appropriate actions are taken to ensure this.
In the evening of Wednesday, January 15, 1992, Bluebonnet Savings Bank (BSB) in Dallas, Texas, got to demonstrate first-hand a key DR maxim: a “disaster” should not be thought of only as an external event that strikes computer operations. Rather, a disaster is anything that interrupts the continuity of business operations. And when disaster struck, Bluebonnet was ready.
The 3725 is Down!
That evening, at this multi-billion dollar bank (with 34 branch offices spread around Texas and a mortgage servicing company in Atlanta), MIS operations came to a halt. An attempt to re-IPL the bank’s IBM mainframe failed when the 3725 communications controller would not load. In addition, operations was experiencing problems with bad tracks on the disk drive.
As at most financial institutions, communication with branches and customers is key to continuing effective business operations at Bluebonnet. Anything that removes that communications link is disastrous. “We have to be able to allow customers to withdraw money, get information on account balances, and the like. You just can’t tell people that they can’t withdraw money because you don’t know how much they have in their accounts,” says Chuck Littleton, Disaster Recovery Planner for the Bank. “So it is standard policy for us to declare a disaster on anything that will knock us out for 24 hours or more.”
Therefore, when it became obvious that the problem was not going to be fixed immediately, that is exactly what the bank did. Bluebonnet Savings Bank declared a disaster with their IBM hotsite in Tampa, Florida, and activated their business contingency plan, automated with Strohl Systems’ LDRPS software, at 4:15 p.m. on January 16.
By 8:00 that evening, key bank employees were on a plane to Tampa, and by 12:15 a.m. they had begun recovery operations. At 6:00 a.m. the Tampa alternative site system was up and running successfully with all databases loaded.
Back in Dallas, recovery was in progress. By 3:00 a.m. the same morning, the communications controller had been brought back up. “After testing it and solving some communication problems with a few of the branches, we were able to determine that we could switch operations back to Dallas, and we did so at 9:00 a.m. In fact, we were only running live at the hot-site for about three hours,” says Littleton. “But if the problem in Dallas hadn’t been solved, we were ready that Friday morning to be in full operation in a way that would have been transparent to our branches and customers and in a way that would have preserved the continuity of business operations.”
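The declaration policy and the recovery timeline described above can be checked with a little date arithmetic. The clock times below are the ones reported in the article; treating them all as readings from a single clock (ignoring any time-zone difference between Dallas and Tampa) is an assumption of this sketch, as are the function and variable names.

```python
from datetime import datetime, timedelta

# Policy quoted by Littleton: declare on any outage of 24 hours or more.
DECLARE_THRESHOLD = timedelta(hours=24)


def should_declare(expected_outage):
    """True when the expected outage meets the bank's stated threshold."""
    return expected_outage >= DECLARE_THRESHOLD


# Times as reported in the article (single-clock assumption, see lead-in).
declared = datetime(1992, 1, 16, 16, 15)          # disaster declared
recovery_started = datetime(1992, 1, 17, 0, 15)   # team begins work in Tampa
hot_site_up = datetime(1992, 1, 17, 6, 0)         # alternative site running
switched_back = datetime(1992, 1, 17, 9, 0)       # operations return to Dallas

time_to_hot_site = hot_site_up - declared
live_at_hot_site = switched_back - hot_site_up

print("Declaration to hot site running:", time_to_hot_site)
print("Time live at the hot site:", live_at_hot_site)
```

Under that assumption the hot site was running 13 hours 45 minutes after the declaration, and the three hours of live operation agree with Littleton's "about three hours."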
The Role of the Plan
Having the hot-site agreement in place was key to Bluebonnet’s ability to react and recover quickly. But just as important, noted BSB’s Disaster Recovery Coordinator Patti Smith, was having an automated business continuity plan that the bank had developed last September.
“We realized that in the event of a disaster, there was a lot of information that we would need to assess quickly,” says Smith. “Things like the names and phone numbers of people we needed to contact, organizational plans, task plans, equipment inventories, and the like. That kind of information is critical to have at your fingertips if you are going to keep doing business and servicing customers.”
So last fall, using plan development software, Smith and the unit managers automated the bank’s recovery plans. They analyzed the needs and functions of their business units and collected the information necessary to ensure the continuity of each key business function in the event of a disaster. “It was the availability of this data from the database that allowed us to react so quickly and efficiently,” says Smith.
The Real World Test
“Because we were actually up and running again by 9:00 a.m. Friday in Dallas,” says Littleton, “this experience served as a thorough test of our disaster recovery and business continuity plan.” And there are several key lessons that both Littleton and Smith point to as a result of the experience.
“First,” says Smith, “you absolutely have to have an automated planning tool in order to maintain the data that is needed to effect the recovery process efficiently. There is simply no way, realistically, that anyone could control and update that much information in a simple written plan.”
Littleton adds, “The second lesson we learned is that it is so critical that the data in the database be current and valid that we will now update our continuity plan on a daily rather than a weekly or monthly basis. All staff changes, CPU or other equipment configuration changes, etc. will be input to the database immediately. It has to be current.”
Finally, both agree that the position of Disaster Recovery Coordinator, Smith’s function, must be made clear and the lines of communication kept open for all who are in any way involved in the recovery. “It is really important in order to minimize confusion,” says Littleton. “We had far too many people calling all over the place to ask questions when they should have been dealing directly with Patti. But we’ve cleared that up now. If anything like this ever happens again, everyone knows that Patti is ‘central control’ for all information regarding recovery operations.” In fact, Bluebonnet Savings Bank now regards the position as so important that Smith has been assigned an assistant.
Although Dallas was back up and running on Friday morning, the disaster recovery team that had flown to Tampa stayed on over the weekend to troubleshoot the problems with the modems and communications lines. They returned on Sunday night, tired, but justifiably proud of a job well done.
This time, the disaster was short-lived. But the experience was an important one. It allowed Bluebonnet Savings Bank to test and refine, under fire, the value and quality of their contingency plan. If there ever is a “next time,” they will be prepared.
Mary Lou Roberts is a free-lance writer and industry consultant with more than 25 years of experience in information systems.
This article adapted from Vol. 5 #2.