Let’s say that in a best-case scenario, you won the lottery and gave your two-week notice to quit your job. Three weeks after you quit, you’re in the Bahamas drinking one of those drinks with a little umbrella, soaking up the Caribbean sun, when your former organization experiences a disaster. Your replacement, Dave, now has to recover the critical application systems. Dave has the same qualifications, education, and pretty much the same experience as you. However, Dave is new to the organization, has never worked with your systems before, and now has to recover them using your recovery procedures. It is your responsibility to write your recovery procedures so that Dave can successfully recover the systems.
Now let’s say you didn’t win the lottery. You’re still going to work every day, punching the same clock, working on the same systems. One night, you get a call at 2 a.m. stating your organization’s building has been destroyed by some type of disaster – a tornado perhaps. Your office, your data center, everything from your job was literally blown away. You have been instructed to pack a bag and get to the recovery hot site as soon as possible.
When you arrive at the hot site at 5 a.m., your back-up tapes and off-site documentation are already waiting for you. The CIO is also waiting and tells you the organization will face costly lawsuits and federal fines, and will probably go under, if you do not get the critical systems restored. No pressure there, right?
You know how to do the recovery in your sleep. So it shouldn’t be a problem, right? Wrong. Some of the critical file names changed last week, and you can’t remember the exact names to use when issuing recovery commands. You now have to dig through your off-site documentation and find your recovery procedures to get those file names. Fortunately for you, you not only had the current file names listed, you also had them bolded so they were easy to read and identify on the pages.
There are two basic formats that can be used to write recovery procedures: background information and instructional information.
Background information should be written using indicative sentences, while the imperative style should be used for writing actual instructions or commands. Indicative sentences state facts in a direct subject-verb-object structure (“This procedure restores the payroll database”), while imperative sentences start with a verb (the pronoun “you” is understood) and issue directions to be followed (“Restore the payroll database”).
Recommended background information includes:
- Purpose of the procedure
- Scope of the procedure (i.e., the location, equipment, personnel, and time that the procedure encompasses)
- Reference materials (i.e., other manuals, information, or materials that should be consulted and stored off-site)
- Documentation describing the applicable forms that must be used when performing the procedures (e.g., for declaring a disaster or requesting delivery of off-site tapes)
- Authorizations listing the specific approvals required
- Particular policies applicable to the procedures
Organize instructional information under headings that are common to each page of detailed procedures. Headings could include:
- Subject category number and description
- Subject subcategory number and description
- Page number
- Revision number
- Superseded date
Procedures should be clearly written. In some cases, procedures can even be “boilerplated” or written as “fill-in-the-blank” templates that are later modified with specific information. This is useful in an organization that has several critical databases to recover: boilerplated recovery procedures can be used to develop an individual recovery procedure for each database, with the blanks filled in with the specific database names, critical file names, or recovery commands.
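As a rough illustration of the fill-in-the-blank idea, the sketch below uses Python's standard `string.Template` to turn one boilerplate procedure into a database-specific one. The database name, file name, and restore command are all hypothetical placeholders invented for this example, not commands from any real utility.

```python
from string import Template

# Hypothetical boilerplate procedure. Each $name is a blank to fill in.
BOILERPLATE = Template(
    "Recover database $db_name\n"
    "1. Locate the back-up file $backup_file.\n"
    "2. Type: $restore_command\n"
    "   Expected result: $expected_result\n"
)


def fill_procedure(values: dict[str, str]) -> str:
    """Fill in every blank; substitute() raises KeyError if one is missing."""
    return BOILERPLATE.substitute(values)


# All values below are made up for illustration only.
payroll = fill_procedure({
    "db_name": "PAYROLL",
    "backup_file": "PAYROLL_FULL.BAK",
    "restore_command": "restore-db PAYROLL from PAYROLL_FULL.BAK",
    "expected_result": "the message 'restore complete' is displayed",
})
print(payroll)
```

Using `substitute()` rather than `safe_substitute()` is a deliberate choice here: a blank that nobody filled in fails loudly while the procedure is being written, not during a recovery.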
Below are helpful tips and reminders for writing detailed procedures:
- Be as specific as possible. Write the procedures with the assumption they will be implemented by someone outside of your organization and/or department who is completely unfamiliar with the functions and operations.
- Use short, direct sentences and keep them simple. Long sentences will overwhelm or confuse the reader, especially at 4 a.m.
- Use topic sentences to start each paragraph.
- Use short paragraphs. Long paragraphs, just like long sentences, can overwhelm and even hinder comprehension of the instructions or information.
- Present one idea at a time. Two thoughts normally require two sentences.
- Use active voice verbs in present tense. Passive voice sentences can be lengthy and may be misinterpreted.
- Avoid jargon.
- Use position titles (rather than personal names of individuals) to reduce maintenance and revision requirements.
- Avoid gender-specific nouns and pronouns that may cause unnecessary revision requirements.
- When issuing commands, type the exact command followed by a remark that tells why the command is being issued and/or what the expected result(s) will be.
- Consider the recovery person’s state of mind. Remember, they may have been involved in the disaster or had a significant other involved. Their state of mind may not be clearly focused.
- Make your procedures easy for the recovery person to follow. A likely scenario could be doing the recovery at 3 a.m. with very little sleep.
- Use bulleted or numbered procedures with columns for “completed by” and “date/time completed” so they serve as a checklist during the recovery period. Afterward they can serve as a log for after-action reviews. Be mindful that they can also serve as evidence in any investigations and/or lawsuits that may ensue.
- Bold or capitalize specific item names such as file names or IP addresses. They will stand out on the page and be easier to locate in the heat of recovery (see Example 1).
- Use graphics and/or screen prints to illustrate difficult points or to differentiate between command/screen results and your procedure steps (see Example 2).
- Use tables to separate procedure steps from their results and comments, or to pair command-line commands with error codes and actions/reactions (see Example 3).
- Develop a uniform pattern when writing the procedures. This will simplify the training process and assist with procedure familiarity.
- Identify events that occur in parallel and events that must occur sequentially.
- Indicate dependencies between events and procedures.
- Use descriptive verbs. Non-descriptive verbs such as “make” and “take” can cause procedures to be excessively wordy. Examples of descriptive verbs are: acquire, activate, advise, answer, assist, back up, balance, compare, compile, contact, count, create, declare, deliver, enter, explain, file, inform, list, locate, log, move, pay, print, record, replace, report, review, store, type.
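The tips on parallel events, sequential events, and dependencies can be sketched in code. The example below is a minimal illustration, assuming Python 3.9 or later for the standard `graphlib` module. It records which recovery steps depend on which, then groups them into stages whose steps could run in parallel. The step names are hypothetical and not taken from any real plan.

```python
from graphlib import TopologicalSorter

# Hypothetical recovery steps. Each step maps to the set of steps
# that must be completed before it can start.
DEPENDS_ON = {
    "restore_os": set(),
    "restore_network": set(),
    "restore_database": {"restore_os"},
    "restore_application": {"restore_database", "restore_network"},
    "verify_user_login": {"restore_application"},
}


def parallel_stages(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into sequential stages; steps within a stage are independent."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything that can start now
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage complete
    return stages


for number, stage in enumerate(parallel_stages(DEPENDS_ON), start=1):
    print(f"Stage {number}: {', '.join(stage)}")
```

Writing the dependencies down in this form also exposes circular dependencies early, since `prepare()` rejects them before any recovery is attempted.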
Scope and Planning Assumptions
Even though the most common scenario for a disaster recovery situation is barred access to the main or primary facility, the “worst case scenario” should be the basis for developing recovery procedures. The worst case scenario is typically defined as the total or significant destruction of the main or primary facility. Be sure to limit the scope of your procedures to your group’s responsibilities. It is too easy to address areas that are outside of your area of responsibility.
You should assume the disaster recovery plan will be followed as detailed. That means assuming that part of the staff will be available to put the plan into action and perform the critical recovery procedures when required, and that off-site items such as software, manuals, and tapes will arrive at the recovery site as planned. Any other assumptions should be considered and documented. Since the assumptions usually drive the recovery plan and procedures, management should carefully review and endorse them.
Less Disastrous Events
If procedures are written based on the premise of starting over from scratch, other, less detrimental situations can be addressed by referring to the applicable portions of the procedures.
The structure of an organization in recovery mode may not match its existing organization chart. Therefore, a team approach is best used when developing recovery plans and procedures. Each team has specific responsibilities that must be executed in order to allow for a successful recovery. Well-structured teams will also perform better when recovering from an actual disaster. Each team should have a primary leader and two alternates designated. These persons will provide the leadership and direction required for developing recovery procedures and implementing them during a disaster.
Potential teams include:
- Management team
- Business recovery team
- Departmental recovery team
- Computer recovery team
- User support team
- Computer back-up team
- Off-site storage team
- Software team
- Applications team
- Computer restoration team
Various combinations of the above teams are possible, depending on the size and requirements of the organization. The number of members assigned to a specific team can also vary depending on need.
Be sure you have a mechanism in place that will allow anyone on your recovery team to perform the recovery procedures. It may be necessary to elevate a user’s privileges in order to issue specific recovery commands.
For scenarios in which no original team members are available to perform the recoveries, or if your plan includes emergency hiring of contractors to perform recoveries, it may be necessary to have in place a user ID and password that are used only during a recovery. This ID would have full administrative privileges. It should be created in advance and be part of your recovery plan. It could be safeguarded by sealing it in an envelope and storing it with your off-site documents.
The benefits of effective disaster recovery procedures include: elimination of confusion and errors, training materials for new employees, and reduced reliance on key individuals and functions (single points of failure).
Chadwick Taylor, CBCP, CRP, is a contracting consultant for EDS at the Farm Service Agency, a division of the U.S. Department of Agriculture. Taylor is also the assistant CERT chief for the Kansas Army National Guard.
Information from a three-part series titled “Disaster Recovery Planning Process” written by Geoffrey H. Wold for Disaster Recovery Journal in 1992 was used in this article.