It Is 10 P.M. ... Do You Know Where Your Backup Tapes Are?
- Published on October 28, 2007
In spite of the many advances in technology and the associated cost reductions that have made electronic vaulting more affordable than ever, many companies still rely on physical backup tapes as the backbone of their recovery plan. Critical data, deemed vital for the recovery of a business, must be current, complete, accessible and transportable in order to assure a timely, successful recovery. Among the lessons gleaned from the Northridge Earthquake is that this string of dependencies is fragile at best and easily broken, especially under the circumstances present in Los Angeles in January 1994.
That the data must be current is a given. That companies pay sufficient attention to the backup process on a regular basis is not. Backups are routinely taken, and diligent tape librarians dutifully package the files daily and ship them off to a vault. But as companies change or grow and those changes are introduced into the information systems, the corresponding changes are not always immediately incorporated into the backup process. As a result, backups of the new or changed files are not always made or, if they are, not always sent off site with the other "vital records." Without regular inspections, audits or recovery drills (a.k.a. exercises or tests), these deficiencies usually go undetected until the records are recalled for a live recovery. By that time, however, it's far too late for a remedy. At best, the timetable of the recovery is blown, because the delinquent backup files have been destroyed in the same event that triggered the recovery; at worst, the recovery fails outright. That the worst case did not occur in the Northridge Quake was due more to providence than to good planning.
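The kind of audit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dataset names, the shape of the vault manifest (dataset name mapped to the date of its newest off-site copy) and the one-day freshness limit are all hypothetical, not taken from any particular tape management system.

```python
from datetime import date, timedelta


def audit_backups(critical, vault_manifest, today, max_age_days=1):
    """Return (missing, stale) lists of dataset names.

    critical:       names of the datasets the business needs to recover.
    vault_manifest: dict of dataset name -> date of newest off-site copy.
    Both structures are hypothetical stand-ins for real inventory data.
    """
    cutoff = today - timedelta(days=max_age_days)
    # Critical datasets that never reached the vault at all.
    missing = sorted(name for name in critical if name not in vault_manifest)
    # Critical datasets whose newest off-site copy is older than the cutoff.
    stale = sorted(
        name for name in critical
        if name in vault_manifest and vault_manifest[name] < cutoff
    )
    return missing, stale


# Hypothetical inventory: "new_crm" was added to production but never
# added to the backup process; "orders" has not been vaulted for a week.
critical = ["payroll", "orders", "new_crm"]
vault_manifest = {
    "payroll": date(1994, 1, 16),
    "orders": date(1994, 1, 10),
}
missing, stale = audit_backups(critical, vault_manifest, today=date(1994, 1, 17))
print("never vaulted:", missing)
print("out of date:", stale)
```

Run on a schedule, a check along these lines surfaces the gap between what the business depends on and what is actually off site before a live recovery does.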
A new wrinkle in this backup/recovery scenario is the frequent use of remote tape mounting robots and storage systems. A few companies have begun to capitalize on the efficiencies of these devices by directing all of their backup files to a separate automated tape mounting system, remote from the home site. The theory is that, in a disaster, one need simply empty the contents of the remote storage system and send all the files to the recovery center. In practice, as witnessed during the recent recoveries, this process is more susceptible to error than the old manual process. The reason is that there is no human oversight in the automated process, and without very strict change controls, exits or Job Control Language might be changed in a way that inadvertently routes some backup tapes to the wrong storage device. This all happens out of sight and at the speed of light, and it is particularly difficult to uncover until the actual files are needed in earnest. And all it takes is a few misdirected backup files to ruin an otherwise flawless recovery.
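One way to restore some oversight is to compare, after every change, where each backup job actually writes against where it is supposed to write. The sketch below assumes that a mapping of job name to output device can be extracted from the job definitions; the job names and device names are invented for illustration.

```python
def misrouted_jobs(jobs, vault_device):
    """Return the names of backup jobs that do not write to the remote vault device.

    jobs:         dict of backup job name -> output device name (hypothetical).
    vault_device: name of the remote automated tape library (hypothetical).
    """
    return sorted(name for name, device in jobs.items() if device != vault_device)


# Hypothetical job definitions: one job was changed to write to a local library.
jobs = {
    "BKPAY01": "REMOTE_ATL",
    "BKORD02": "REMOTE_ATL",
    "BKCRM03": "LOCAL_ATL",
}
print(misrouted_jobs(jobs, "REMOTE_ATL"))
```

An empty list means every backup job still targets the remote system; anything else names the jobs whose tapes would be left behind in a disaster.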
A final lesson for companies in earthquake-prone areas concerns the accessibility and transportability of the backup files. The piercing eye of the news camera brought the stark reality of the devastation of Northridge into every living room. It was clear for all to see that the infrastructure of a large segment of the city was severely crippled. Roads and highways, power and water, communications and airports were all curtailed if not halted. This is the environment that planners must continue to envision when they develop a contingency plan, select a location for the storage of their vital records and decide how those records will be transported to the recovery center. Once again, this event pointed out the deficiencies in some of the plans. Vital records vaulters were selected whose own facilities were affected by the event. Communications to vaulting locations were extremely difficult, if not impossible, in the hours immediately after the quake. The inability to provide prompt direction to vaulters delayed the shipment of some sets of vital records from the stricken area.
Remedies for this serious exposure are many and varied and can easily be implemented, although some may increase the ongoing cost of backing up data. Choosing a vital records repository that is a considerable distance (at least 50 miles) from the data center is one way to assure that the disaster that cripples the center does not also affect the backup files. Even better, choosing a vendor in close proximity to the recovery site not only assures that the same disaster won't strike both the data center and the vault but also that the time to retrieve the vital records and send them to the hot site is absolutely minimal. The trade-offs in cost versus benefit, however, must be made for each individual case.
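The 50-mile rule of thumb is easy to check for a candidate vault. The sketch below uses the standard haversine formula for great-circle distance; the coordinates are made-up placeholders, and straight-line distance is only a proxy, since it says nothing about shared fault lines, roads or utilities.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def far_enough(data_center, vault, min_miles=50.0):
    """True if the vault is at least min_miles from the data center."""
    return miles_between(*data_center, *vault) >= min_miles


# Hypothetical (latitude, longitude) pairs, for illustration only.
data_center = (34.0, -118.0)
nearby_vault = (34.2, -118.0)   # roughly 14 miles away
distant_vault = (35.0, -118.0)  # roughly 69 miles away
print(far_enough(data_center, nearby_vault))
print(far_enough(data_center, distant_vault))
```

The same function can be pointed at the recovery site instead of the data center to compare how long the final leg to the hot site is likely to be for each vendor.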
John Nevola is a manager for IBM Business Recovery Services Center.