Best practices are typically thought of in a technological sense. The thinking is that there is one best way to implement a technology solution. However, the identical technology, such as a web site, will have two completely different 'best' ways to be recovered depending upon the business function. Is the web site your entire reason for existing, like a dot com company, or is it an adjunct to traditional processes, like class registration at a college? Obviously, these two organizations will have different technology strategies, and budgets, for business continuity.
Best practices have a little to do with technology, but they have a lot to do with process. The Y2K crisis and the 'e-business recovery crisis' are identical events - they are risks to your continued operation. There are new risks to your organization every year. Yet, many organizations continue to treat each risk as a one-time event. After the risk is dealt with, the interest in the risk-reducing process wanes. The employees who worked on business continuity move on, and no one takes their place.
Once a new risk hits the radar screen, there is a time-consuming and sometimes expensive effort to revive contingency planning. New staff have to be found. The proper experience may no longer be available in the employee pool. Long-time employees may become jaded about the planning process, because it never seems that anybody cares about the work. Worse, something has to be done that was not budgeted for this year.
So, what are best practices? Best practices boil down to a commitment to an ongoing business continuity planning process. It takes surprisingly little executive and management effort to keep the process alive. It only takes a genuine concern.
In the rest of this article I present what I consider the nine best practices for business continuity. The payoff is that implementing them is the cheapest way to deal with recovery planning. Starting at the top:
1. The Board of Directors annually reviews the business continuity program.
Directors have responsibility for the protection of the corporate assets and long-term survival of the organization. The buck stops here. If the board does not ask for business continuity it is difficult to sustain a program.
There is often a policy for business continuity in place, especially in very large organizations. But, the divisions in the field can and do ignore the policy because the people at the top never ask how the policy is being carried out. That is why it is important for the board to ask for a status report at least once a year.
How do you get a board to care? An executive can make them care by selling the concept of risk reduction to them. A good business continuity program, well presented, will win over a board and win support for the program.
2. The responsibility for business continuity rests with a top executive (CEO or COO).
Without executive responsibility, the business continuity process does not have enough stature to sustain it through tight budgets. It also requires this level of support to reach across departmental boundaries to get the job done.
Organizations implement new technologies (or buy/sell business units) without recovery planning at the outset because there is no executive asking about the issues. Conversely, once an executive is concerned about business continuity, new initiatives will be reviewed for recoverability.
3. A distinct staff, with associated budget, performs the business continuity activities.
Without a dedicated staff to conduct the activities of recovery planning, the activities will not get done. By dedicated I mean someone whose job description specifically states business continuity, even if it is a part-time responsibility. The number of staff depends on the size and distribution of the organization.
While the business continuity staff is responsible for the activities of the business continuity program, the department managers are responsible for the recoverability of their own department. Department managers, not contingency planners, should report the status of their recovery plan to executives.
The single biggest reason for the demise of business continuity programs is that no one owns the work.
4. The business continuity function spans all aspects of the organization.
All business processes must be recoverable (eventually) no matter what their dependence upon information systems. The recovery strategies must focus on the business process and not on the technology components of the process. It is not 'a data center problem'.
Typically, it is a people problem first, and a technology problem second. I have seen a large organization whose technology was recovered quickly after a hurricane disaster, but where none of the workers at the affected site were using any terminals. The plan, as far as it went, worked, but the people were not prepared.
5. Business continuity planning is a continuous process within the organization.
The cycle begins with a business impact analysis (BIA). Next comes a review of the recovery plan to see if it meets the BIA requirements, that is, whether it documents a recovery strategy appropriate to the risk. Last is a technical review of the plan to see if it contains all the information needed to support the recovery strategy. This is an iterative process, repeated at least once a year.
Sometimes there is a big push to complete a BIA, but it is done as a one-time project with outside experts. Once the experts leave, the next new business process and new technology are implemented but not included in the existing recovery plan. The BIA process has to be internalized and continually renewed or it will fail.
6. The organization maintains a comprehensive backup policy that includes all vital records.
Vital records are commonly overlooked. Much of an organization's recovery still depends upon paper records, especially work-in-progress. The organization should be aware of the location of vital records and have an adequate protection program in place (fireproof cabinets, offsite duplication, clear desk policy, etc.).
7. Recovery strategies are in place and are based upon the impact that the loss of a business process would have upon the organization.
A BIA will point toward an appropriate recovery strategy. A recovery strategy appropriate to the risk should be in place. There is no one-strategy-fits-all 'best practice' recovery strategy.
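As a rough illustration of how a BIA result can point toward a strategy, the sketch below maps each business process's maximum tolerable outage to a recovery strategy tier. The tier names, the hour thresholds, and the two sample processes are assumptions made up for this example; a real BIA would set them from the organization's own loss estimates.

```python
from dataclasses import dataclass

# Hypothetical tiers: (longest outage the tier is meant for, in hours; strategy).
# The thresholds are illustrative assumptions, not recommendations.
STRATEGY_TIERS = [
    (4, "hot site"),
    (72, "warm site"),
    (float("inf"), "manual workaround and rebuild"),
]


@dataclass
class BusinessProcess:
    name: str
    max_tolerable_outage_hours: float  # taken from the BIA


def recommend_strategy(process: BusinessProcess) -> str:
    """Return the first tier whose limit covers the tolerable outage."""
    for limit_hours, strategy in STRATEGY_TIERS:
        if process.max_tolerable_outage_hours <= limit_hours:
            return strategy
    raise ValueError(f"no tier covers {process.name}")


# The same technology (a web site) lands in different tiers
# depending on the business function it serves.
storefront = BusinessProcess("dot com storefront", 2)
registration = BusinessProcess("college class registration", 120)
print(recommend_strategy(storefront))
print(recommend_strategy(registration))
```

The point of the sketch is the direction of the decision: the business impact selects the strategy, and the technology is chosen afterward to fit it.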
8. A recovery strategy-testing program is in place.
Conduct test executions of the recovery strategies, at least annually, 1) to verify that they work, 2) to ensure that they are sufficiently documented, and 3) to train the staff in their execution. If backup tapes are involved in a technology test, run the test with tapes from offsite storage only. The test should include a mid-week recovery, with application of incremental backups (if applicable) and synchronization of applications. Test manual business processes using a structured walk-through technique.
One of the valuable benefits of testing is that it makes employees aware that there is a plan at all, and that they have a role in the recovery.
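For the mid-week technology test described above, it can help to work out beforehand exactly which tapes the test is allowed to use. The sketch below is one way to do that, under assumptions of my own: a hypothetical tape catalog with one backup per day, each marked "full" or "incremental", and a flag saying whether the tape is held offsite. It picks the latest offsite full backup on or before the target date, then adds the incrementals that follow it, stopping at the first tape that has not yet reached offsite storage.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Backup:
    taken_on: date
    kind: str      # "full" or "incremental"
    offsite: bool  # True if the tape is held at offsite storage


def restore_chain(backups, target):
    """Return the offsite tapes to apply, in order, to recover as close
    to `target` as offsite storage allows. Assumes one backup per day."""
    candidates = sorted(
        (b for b in backups if b.taken_on <= target),
        key=lambda b: b.taken_on,
    )
    offsite_fulls = [b for b in candidates if b.kind == "full" and b.offsite]
    if not offsite_fulls:
        raise ValueError("no offsite full backup on or before the target date")
    last_full = offsite_fulls[-1]

    chain = [last_full]
    for b in candidates:
        if b.taken_on <= last_full.taken_on:
            continue
        if not b.offsite:
            # A tape still onsite breaks the chain; later incrementals
            # cannot be applied safely without it.
            break
        chain.append(b)
    return chain


# Hypothetical week: full backup on Sunday, incrementals Monday to Wednesday.
# Wednesday's tape has not yet been sent offsite.
catalog = [
    Backup(date(2024, 1, 7), "full", True),
    Backup(date(2024, 1, 8), "incremental", True),
    Backup(date(2024, 1, 9), "incremental", True),
    Backup(date(2024, 1, 10), "incremental", False),
]
for tape in restore_chain(catalog, date(2024, 1, 10)):
    print(tape.taken_on, tape.kind)
```

In this made-up week the recoverable point is Tuesday night, not Wednesday. A gap like that between the target and what offsite storage can actually deliver is the kind of finding a mid-week test is meant to surface.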
9. The recovery manual that documents the program is reasonably current and available under any circumstances. The document is structured so that an outside technical expert, unfamiliar with the organization, could execute technical recovery strategies.
I know of a company that was recovered by a hot site technician using only tapes and a recovery plan when the company's employee had to suddenly leave the hot site. I know this works. The issue with the format of the recovery plan is not whether it is web-enabled, or even whether it is still in WordPerfect 5.1. The issue is the currency of the content, and access to the content.
So, what do these nine best practices mean in the real world? Think back to the organization I mentioned above that has a new web site. The contingency planner would have been in the planning meetings for the new web site. She would have conducted a business impact analysis to determine the estimated value of the web site after a year of operation. The appropriate recovery strategy would have been implemented as part of the package when it went live. The cost of the recovery strategy would have been rolled into the budget for the new project and not added as an afterthought. Isn't that what you did for your web site?
Best practices are not about technology. They are about process. It means that you don't have to gear up the organization to think about a new problem every time. All new problems become 'business as usual'. That is the best practice.
What should an executive do to implement best practices? Simply acknowledge that risk reduction is part of your fiduciary responsibility. Simply say, 'I own this.' The rest will flow naturally.
Reinhard Koch is Disaster Recovery Product Manager at Strategic Technologies, Inc. He has conducted over 40 recovery planning consulting engagements, and has personally been involved in three declared disasters requiring hot site recoveries. He welcomes your comments at firstname.lastname@example.org.