When I first took up skiing, the instructor drilled us again and again on the proper way to fall. By the end of my first season, I had the theory and plan of action down pat.
The problem was that when I got into trouble, I rarely fell according to “the plan.”
This type of reality versus theory conflict is even more frustrating when it comes to a data center’s MVS system going down. Although few companies will ever experience a major disaster, it’s like skiing--you need a recovery plan just in case.
A novice who grabs a pair of skis and heads for the expert slopes is considered a fool. Most people get advice and assistance so they have the proper equipment. Then, they take lessons and continually practice to hone their skills and their disaster (spill) recovery.
Not every spill that a skier encounters will be according to plan, but having a plan can prevent injuries. The same is true of data center disasters. Any crash you can walk away from is a good recovery.
The Right Tools for the Job
The first thing you need for a sound disaster recovery plan (DRP) is the proper equipment, beginning with a good team. Since DRPs are time-consuming and cumbersome to develop, organizations usually assign the task to staff members who are available for extended periods of time...whether they are qualified for the task or not. But it pays to wait until the right people are available, because they will more quickly produce a better plan, one that will generally be less dependent on the “critical” staff.
One thing you don’t need is excessive documentation. Reams and reams of written documentation won’t guarantee that you will have the information you need when you need it most. DRPs should be designed to handle an unpredictable event. But by their very nature, unpredicted events cannot be prepared for, no matter how much documentation you have. A short, concise DRP that can be readily understood is more effective than one that covers “every” possibility.
Another problem with DRP documentation is that, by necessity, it’s written ahead of time. It is a projection from historical records of what you think the situation will be when disaster strikes. But what if things do not take place exactly as predicted? You need a way to look at things as they exist in real time--i.e., a one-pack backup system. Although slow and subject to the same problems as the primary system, it’s better than spending the time and money traveling to your hot site, hoping that the system there will work in sync with your primary system.
Get in Sync
Systems can get out of sync for a variety of reasons, some intentional and some unintentional.
Although the technical staff will generally make all of the changes perfectly on the primary systems, they may not remember (or even be aware) that the DRP system also needs updating.
Or, perhaps each week, you alternate between updating the multiple systems...actually forcing your systems out of sync.
The fact is, since most out-of-sync situations are oversights, you can be sure that they won’t be documented, and there is no easy way to prevent them.
Your one-pack or starter system is a reasonable alternative, but what’s really needed is a simple tool that allows you to quickly determine what the status is, one that also allows you to make minor corrections with ease.
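As a rough illustration of what such a status check might look like, the sketch below compares two snapshots of system definitions and reports what is missing or different on the backup side. It is only a minimal sketch under stated assumptions: the member names (MEMBERA, MEMBERB, MEMBERC), their contents, and the idea of holding each snapshot as a simple name-to-text mapping are all hypothetical, and a real tool would read the definitions from the systems themselves.

```python
import hashlib


def fingerprint(members):
    """Map each member name to a SHA-256 digest of its contents."""
    return {
        name: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for name, text in members.items()
    }


def compare_systems(primary, backup):
    """Report members that are missing, extra, or changed on the backup."""
    p = fingerprint(primary)
    b = fingerprint(backup)
    return {
        "missing_on_backup": sorted(set(p) - set(b)),
        "extra_on_backup": sorted(set(b) - set(p)),
        "changed": sorted(n for n in set(p) & set(b) if p[n] != b[n]),
    }


# Hypothetical snapshots (member name -> contents); placeholders only.
primary = {
    "MEMBERA": "setting one",
    "MEMBERB": "setting two, updated last week",
    "MEMBERC": "setting three",
}
backup = {
    "MEMBERA": "setting one",
    "MEMBERB": "setting two",
}

report = compare_systems(primary, backup)
for category, names in report.items():
    print(category, names)
```

A report like this only tells you where the two systems disagree; deciding which side is correct, and making the correction, is still a job for the people who know the system.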
Under Lock and Key
The security of your system and your data is vital--it cannot be compromised. But when a disaster strikes, you need to be certain that your security system won’t keep you from getting the system up.
The ideal solution gives you a contingency that you can use in a dire situation, but one that can be kept under “lock and key” so it cannot be misused.
While dire situations rarely occur, any downtime is costly, so plan for it and have tools available for your people to bring the system back up quickly.
Then, develop a DRP that’s simple, include the right tools and hold tight to your poles.
Paul Robichaux is Chairman of the Board for NewEra Software, Inc.
This article is adapted from Vol. 4 No. 1, p. 18.