Disaster recovery is very different from insurance. While a thorough disaster recovery program does protect the company’s “information assets,” unlike insurance, it cannot be accomplished effectively on a task basis. When fire, theft or any other insurance is procured, the task is essentially complete once the policy is placed, never to be revisited except at the annual premium payment or when filing a claim after a covered event. This is not disaster recovery! To be successful, any disaster recovery program must be viewed as a refinement to the normal operating procedures; as an element of standard performance objectives; and as a fundamental part of the company’s operating culture. Properly presented, it is one of the cornerstones of any quality program, a key contribution the MIS function can and should make to the quality orientation of the company. It rightfully belongs within the organization’s “concern for quality” program, not under the insurance umbrella.
The Quality Approach
There are many ways for an organization to effectively implement a quality approach to doing business. Nonetheless, inherent in all approaches are three fundamental criteria:
- They all require a focus on the customer
- They require a process orientation
- They assume data driven decisions
MIS has become irreplaceable in the quest for quality, especially in these three areas. To truly focus on the customer, companies need a wide variety of contacts with their customer base, an effective method of understanding each customer’s history to give the data perspective, and a method for evaluating the data to give it relevance. This information is, for the most part, gathered, compiled, analyzed and retained via computer applications. Key buying trends, credit limits, inventory availability and delivery tracking are all further examples of on-line systems that are customer focused and without which most companies cannot operate.
Equally critical to a successful quality program is a process orientation. The processes critical to successful transactions largely lie within and are controlled by our computer systems.
Finally, if decisions are to be made that are data driven, the facts and data must be available and, in today’s fast paced environment, they must be available quickly and accurately. We have all come to rely exclusively on our computer support for critical data on a timely basis.
The nature and importance of MIS’ role in the company’s success or failure is fully emphasized by recognition that data processing has subtly evolved from once being a performance enhancing tool through the key element stage and now resides clearly within the critical functions area. Without it, most organizations very quickly lose their ability to even be in business. It is no longer just nice to have, nor merely an important support tool. MIS has become the nerve center of the company. This evolution has occurred incrementally, for the most part, and the frequent result is that senior management is far more dependent on data processing than they may realize or may even be willing to accept.
Combined with this subtle evolution is the problem of the “magic” in data processing. It is understood by so few that often senior management would rather not deal with it directly, but find a home for it organizationally that allows them to continue to view it as a support organization somewhat akin to the personnel department. To them, automation is important, to be sure, but not a mainstream function - more of a necessary evil.
MIS management must rethink their disaster recovery reasoning and position disaster recovery rightfully within the Total Quality program and nowhere else! The next step is to separate it clearly from the “insurance” concept. It is not simply insurance. A well planned disaster recovery program can be tested and proven. It is not designed merely to recover losses, as insurance is. It is designed to avoid losses by maintaining the required levels of data processing support regardless of unanticipated events. Beyond this, when finely integrated into the normal operating culture of the organization, it results in improved performance, enhanced employee skills, better procedures, more reliable methods and improved operating policies. In fact, even though a well designed disaster recovery program takes only a very small percentage of the data processing operating budget to sustain, companies routinely find that the total cost can be recovered from just the savings resulting from improved performance.
Once disaster recovery is rightfully explained and understood to be a key part of the quality approach to business; that it is in fact a behavior modification, a cultural change rather than a task; that it is MIS’ contribution to assuring continuous, predictable performance; it should receive more than tacit approval. It should and will get management’s enthusiastic support. With that support in hand, MIS can proceed to develop and implement a program that reflects the needs and nature of their organization and one that will assure successful recovery.
Implementing an effective disaster recovery program is not a lengthy nor a particularly complicated process - or at least it shouldn’t be. Assuming the organization decides to utilize external support, which brings with it both experience and the capability of focusing entirely on program implementation, it can be both a relatively painless process and one that results in measurable performance improvements. The options are not complicated and they are fairly easy to evaluate on a cost vs. benefit basis.
Clearly, the best solution is to establish a fully redundant capability. A fully configured and staffed redundant site is, however, most often beyond the bounds of economic reality for most organizations.
So what are the remaining options? One option is to do nothing, which is at least a decision and may, in fact, be a decision that the organization is comfortable with. Companies that are not driven to improve the quality of their entire organization can in fact be comfortable with the notion of being unable to support their customers’ requirements over some period of time. This, however, is an unfathomable position for the vast majority of companies. The next option is to implement a cold site. This is often the most frugal step. Yet, while it has low financial impact, it also results in a prolonged recovery; it is rarely testable; and only a very small percentage of the day-to-day gains in performance, experience, policy and procedure that we discussed earlier can be realized. And if realized, they tend to be realized to such a small degree that they can rarely be effectively measured.
The most cost effective approach, when measured against quantifiable and qualifiable goals, is the hot site. It is a proactive concept. It is testable. And it results in demonstrable organizational improvement. It is without question the most cost effective and offers the most assured recovery capability over and above all of the other tangible benefits.
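One way to put the options side by side on a cost vs. benefit basis is sketched below. It is a minimal illustration, not a method taken from this article: the `RecoveryOption` class, the `expected_annual_cost` function and every dollar figure, recovery time and probability are hypothetical placeholders to be replaced with your own estimates. The model also leaves out testability and the day-to-day performance gains described above, which are harder to reduce to a single number.

```python
from dataclasses import dataclass


@dataclass
class RecoveryOption:
    name: str
    annual_cost: float    # hypothetical yearly cost to sustain the option
    recovery_days: float  # assumed days without processing after a disaster


def expected_annual_cost(option, daily_outage_loss, annual_disaster_probability):
    """Yearly program cost plus the probability-weighted loss from downtime."""
    expected_loss = (annual_disaster_probability
                     * option.recovery_days
                     * daily_outage_loss)
    return option.annual_cost + expected_loss


# Placeholder figures for illustration only; they are not industry data.
options = [
    RecoveryOption("do nothing", annual_cost=0, recovery_days=30),
    RecoveryOption("cold site", annual_cost=20_000, recovery_days=10),
    RecoveryOption("hot site", annual_cost=60_000, recovery_days=1),
    RecoveryOption("redundant site", annual_cost=1_000_000, recovery_days=0),
]

for option in options:
    cost = expected_annual_cost(option,
                                daily_outage_loss=250_000,
                                annual_disaster_probability=0.05)
    print(f"{option.name:15s} {cost:>12,.0f}")
```

Which option comes out ahead depends entirely on the estimates supplied, so the value of the exercise lies in forcing those estimates into the open rather than in the particular totals printed.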
How then to choose a disaster recovery vendor/partner? There are several categories of measurement and evaluation that are critical to making the right choice. First is a close look at the potential vendor, including:
- Program philosophy
- Financial depth
- Experience in the disaster recovery industry
From these you can determine if they have demonstrated the will and capacity to meet their promises and your expectations. It is important to confirm that their approach and philosophy matches your requirements.
Next consider the facilities offered, including:
- Physical location
- Design, including ownership of the property, buildings and who actually employs the staff
- Operational and technical support available
- Determine the odds that you will/can work with the same people in a familiar site during rehearsals and actual recovery
- Consider the power available including generators and fuel storage
- Utility support including the number of telephone central offices and the number of electric company grids that support the facility
- The security, both protecting your data and your employees
What is not as important is the number of sites and/or systems they offer. What is important is the design of the facility and the capability of their people. Paramount is the ratio of customers to systems/sites, for this is a key predictor of your odds of both testing at a frequency that meets your requirements and of recovering successfully from any unanticipated interruption.
Review carefully their approach to testing/rehearsals:
- What types of rehearsals do they employ
- How difficult is it to secure test time
- How flexible are they in supporting your requirements during tests, the length of the test day, and the technical and operational support available
From these, you can determine their commitment to the rehearsal process, their flexibility, and the resulting probability of successful recovery.
Confirm that their program and the associated costs match your needs. Check their history for their ability to meet their promised support, including:
- Technical and operational people
- Their ability to support both rehearsals and actual recovery
- Their capability to help you be optimally productive during tests and actual recovery
From their historical measurements, as well as an occasional reference, you can match promises to reality. Rely more on historical, statistical data than on references. Everyone has a few good references.
The ability to prove the value of their program over time and over a broad variety of customers is critical and a review of their statistical data gathered from their customers over time will provide this for you.
Determine how many of their customers are in your immediate area and consider the impact of this customer concentration on your potential recovery.
During a disaster these concentrations have the effect of creating a competition among customers for their disaster recovery vendor’s operational and technical support resources.
Unlike user groups, this is one occasion when less is more - due to the impact of increasingly frequent regional disasters such as riots, fires, floods, earthquakes, hurricanes and the like. Viruses are rapidly moving up the list of disaster risks and with the broad movement to more distributed processing; the movement toward more on-line support; more EDI between vendors and customers; and the movement towards open systems, this virus risk will continue to grow dramatically as well.
Consider their ability to help you prepare for enterprise recovery. Can they, and will they, assist in preparing your recovery plan, maintaining the plan and rehearsing all the critical aspects of the plan?
Finally, and it should be the final step, carefully look at the associated costs, including:
- The monthly participation fee
- The length of the contractual commitment
- Any associated cancellation fees
- Testing/rehearsal fees
- Declaration fees
- Daily usage fees
- Contingency planning fees
- Technical and operational support fees
- Periodic cost increases
- Travel and lodging costs for you and your people
These costs will not only give you the true measurement of the economic impact, but will serve as an additional barometer of the capability and desire behind their promises.
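To compare quotes on an equal footing, the fee categories listed above can be rolled into a single figure for the contract term. The sketch below is one possible way to do that; the `VendorQuote` fields, the assumption that periodic increases apply once a year to the monthly fee only, and the sample numbers are all hypothetical. Cancellation fees are left out because they apply only if the contract ends early.

```python
from dataclasses import dataclass


@dataclass
class VendorQuote:
    monthly_fee: float
    contract_months: int
    rehearsal_fee: float = 0.0         # charged per rehearsal
    rehearsals_per_year: int = 0
    travel_per_rehearsal: float = 0.0  # travel and lodging per rehearsal
    planning_fee: float = 0.0          # one-time contingency planning fee
    support_fee_per_year: float = 0.0  # technical and operational support
    annual_increase: float = 0.0       # 0.05 means a 5% yearly increase
    declaration_fee: float = 0.0
    daily_usage_fee: float = 0.0


def contract_cost(quote):
    """Cost over the full term, assuming no disaster is declared."""
    total = quote.planning_fee
    for month in range(quote.contract_months):
        year = month // 12  # the increase is assumed to apply once per year
        total += quote.monthly_fee * (1 + quote.annual_increase) ** year
    years = quote.contract_months / 12
    per_year = (quote.rehearsals_per_year
                * (quote.rehearsal_fee + quote.travel_per_rehearsal)
                + quote.support_fee_per_year)
    return total + years * per_year


def declaration_cost(quote, days_at_site):
    """Extra cost of one declared disaster lasting days_at_site days."""
    return quote.declaration_fee + quote.daily_usage_fee * days_at_site


# Invented figures, for illustration only.
quote = VendorQuote(monthly_fee=1_000, contract_months=24,
                    rehearsal_fee=500, rehearsals_per_year=2,
                    travel_per_rehearsal=250, planning_fee=2_000,
                    annual_increase=0.10,
                    declaration_fee=5_000, daily_usage_fee=800)
print(round(contract_cost(quote), 2))
print(declaration_cost(quote, days_at_site=3))
```

Keeping the declaration cost separate from the term cost makes it easier to see which vendors charge little to subscribe but a great deal to actually recover.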
Selection need not be a lengthy process and implementation need not be a battle of wills, yours vs. management’s.
Disaster recovery, when rightfully viewed as a basic part of the organization’s ability to continue to meet its customers’ needs on a timely basis regardless of unanticipated events, is an easy, cost-effective, performance enhancing, quality-oriented decision!
Michael Pearce is the Executive Account Manager for Weyerhaeuser Recovery Services.
This article was adapted from Vol. 6 #1.