The limiting factor in the usefulness of a disaster recovery plan is speed: maximizing the speed of backup procedures and minimizing data restoration time are the name of the game. The cost benefits of optimizing these parameters are effectively incalculable, since the whole purpose of the plan is to allow business-critical functions to continue with as short an interruption as possible, despite operation-disrupting events.
The importance of fast VM system restoration was brought home to Dow Corning recently during a hotsite test, when the company tried, but failed, to complete a partial system restoration of the equivalent of 120 single volumes of data in a 32-hour window. Had the restoration not been under test conditions, a wide variety of business-critical operations running on the VM system would simply not have been available for an extended period of time. The result could have been severe.
Dow Corning installed its VM system in 1983, and since that time applications and users have steadily increased. The system now runs on an IBM 3090-400 and supports 6000 users worldwide.
The hotsite test underscored the firm's dependence on the VM system as well as the need for an improved backup and restore system. Company officials responded by implementing Syncsort/BACKUP (Syback).
Officials initially obtained the product on a trial basis just one month before the next hotsite test. Yet even with that restricted time period, officials were able to shorten the restoration process to just three hours and 25 minutes, an improvement of almost 1,000 percent, due primarily to an enormous improvement in physical restoration time. Where the previous product required 3.6 hours to physically restore the 11 volumes the company needed to IPL the basic system, the new system completed the same process about two and one-half times faster, in just 1.5 hours.
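The ratios quoted above can be checked with a few lines of arithmetic. The following Python sketch uses only the durations given in this article. The function name `speedup` is illustrative, and the 32-hour figure is treated as a lower bound, since the earlier restoration did not finish within that window.

```python
def speedup(old_hours: float, new_hours: float) -> float:
    """Return how many times faster the new duration is than the old one."""
    if new_hours <= 0:
        raise ValueError("new_hours must be positive")
    return old_hours / new_hours


# Partial system restoration: 32-hour window (not completed) vs. 3 h 25 min.
window_hours = 32.0
new_restore_hours = 3 + 25 / 60

# Physical restore of the 11 volumes needed to IPL: 3.6 hours vs. 1.5 hours.
old_ipl_hours, new_ipl_hours = 3.6, 1.5

print(f"Partial restoration: at least {speedup(window_hours, new_restore_hours):.1f}x faster")
print(f"IPL volume restore: {speedup(old_ipl_hours, new_ipl_hours):.1f}x faster")
```

The first ratio comes out a little above nine, in line with the "almost 1,000 percent" figure, and the second is 2.4, or roughly two and one-half times.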
Because it took so long to physically restore files with the former system, the company depended on logical backups. But logical restores are much slower to complete than physical restores, even with the fastest system, because they are done on a minidisk-by-minidisk basis. Restoring a system from logical backups requires restoring minidisks individually, by manually entering a separate restore command for each minidisk from a hard-copy file listing. Physical restores, by comparison, are done cylinder by cylinder and require no file catalogue.
To restore all data to the point of a disaster (daily sync-point) with the former product would have taken weeks – and would have cost the company dearly. The goal of the new business continuation plan (BCP) is complete restoration to the daily sync-point in just 16 hours.
Company officials are confident of achieving this goal because of the speed with which the new system completes physical restores as well as its streamlined logical backup and restore process. In a benchmark test, the former product took 25 minutes to logically back up a user volume comprising 200 to 300 minidisks. The new system, by comparison, took half that time.
This fast logical backup time enables the company to improve its daily backup procedures as well. In the past, each daily incremental backup included only data created since the previous daily incremental. While this saved backup time, it greatly slowed restoration because as many as five tapes could be required to complete a daily sync-point restoration.
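The tape count described above can be illustrated with a short sketch. It assumes one tape per daily backup, which is a simplification. The article does not spell out the revised daily procedure, so the "cumulative" variant (each daily backup holding everything changed since the last weekly backup) is an assumption shown only for contrast, and the function name is hypothetical.

```python
def tapes_for_restore(days_since_weekly: int, cumulative: bool) -> int:
    """Count daily backup tapes needed to reach the latest daily sync-point.

    Assumes one tape per daily backup (an illustrative simplification).
    """
    if days_since_weekly < 0:
        raise ValueError("days_since_weekly must be non-negative")
    if days_since_weekly == 0:
        return 0  # the weekly backup alone is already current
    # Chained incrementals: every daily tape since the weekly backup is needed.
    # Cumulative backups (assumed scheme): only the newest daily tape is needed.
    return 1 if cumulative else days_since_weekly


for day in range(1, 6):
    chained = tapes_for_restore(day, cumulative=False)
    single = tapes_for_restore(day, cumulative=True)
    print(f"day {day}: chained={chained} tape(s), cumulative={single} tape(s)")
```

Under these assumptions, the chained scheme reaches the five-tape worst case on the fifth day, while a cumulative scheme trades longer daily backups for a single daily tape at restore time.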
To further optimize restoration speed, the new system supports up to 10 tape drives simultaneously and features non-specific tape mounts. With an automatic tape loader, each of which supports 10 tapes, on each of the company's 10 tape drives, restoring all 220 tapes used for weekly physical backups requires only that the operator make sure the loaders are filled with tapes.
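As a rough check on the operator workload, the loader figures above imply how many times the loaders must be filled. The sketch below assumes all loaders are filled together in rounds and that any tape can go in any loader, consistent with non-specific mounts. The function name is hypothetical.

```python
def loader_fills(total_tapes: int, drives: int, tapes_per_loader: int) -> int:
    """Return how many times the full set of loaders must be filled."""
    if total_tapes < 0:
        raise ValueError("total_tapes must be non-negative")
    if drives <= 0 or tapes_per_loader <= 0:
        raise ValueError("drives and tapes_per_loader must be positive")
    capacity = drives * tapes_per_loader  # tapes available per round of fills
    return -(-total_tapes // capacity)    # integer ceiling division


# Figures from the article: 220 weekly tapes, 10 drives, 10 tapes per loader.
print(loader_fills(total_tapes=220, drives=10, tapes_per_loader=10))  # prints 3
```

With 100 tapes available per round, the 220 weekly tapes need three rounds of filling rather than 220 individual mounts.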
The cost benefits of optimizing backup and restore procedures are incalculable. Virtually all companies depend on their data resources and should implement business continuation, or disaster recovery, plans.
Stanley G. Pope is a VM system programmer with Dow Corning Corporation in Midland, Michigan.
This article is adapted from Vol. 6 #2.