
The Test That Wasn't a Test
By Mary Lou Roberts
On the evening of Wednesday, January 15, 1992, Bluebonnet Savings Bank (BSB) in Dallas, Texas, got
to demonstrate first-hand a key DR maxim: a disaster should not be thought of only as an external
event that strikes computer operations. Rather, a disaster is anything that interrupts the continuity of
business operations. And when disaster struck, Bluebonnet was ready.
The 3725 is Down!
That evening, at this multi-billion dollar bank (with 34 branch offices spread around Texas and a
mortgage servicing company in Atlanta), MIS operations came to a halt. An attempt to re-IPL the
bank's IBM mainframe failed when the 3725 communications controller would not load. In addition,
operations was experiencing problems with bad tracks on the disk drive.
As at most financial institutions, communication with branches and customers is key to continuing
effective business operations at Bluebonnet. Anything that removes that communications link is
disastrous. "We have to be able to allow customers to withdraw money, get information on account
balances, and the like. You just can't tell people that they can't withdraw money because you don't
know how much they have in their accounts," says Chuck Littleton, Disaster Recovery Planner for the
bank. "So it is standard policy for us to declare a disaster on anything that will knock us out for 24
hours or more."
Therefore, when it became obvious that the problem was not going to be fixed immediately, that is
exactly what the bank did. Bluebonnet Savings Bank declared a disaster with their IBM hot-site in
Tampa, Florida, and activated their business contingency plan, automated with Strohl Systems LDRPS
software, at 4:15 p.m. on January 16.
By 8:00 that evening, key bank employees were on a plane to Tampa, and by 12:15 a.m. they had begun
recovery operations. At 6:00 a.m. the Tampa alternative site system was up and running successfully
with all databases loaded.
Back in Dallas, recovery was in progress. By 3:00 a.m. the same morning, the communications
controller had been brought back up. "After testing it and solving some communication problems with a
few of the branches, we were able to determine that we could switch operations back to Dallas, and we
did so at 9:00 a.m. In fact, we were only running live at the hot-site for about three hours," says
Littleton. "But if the problem in Dallas hadn't been solved, we were ready that Friday morning to be in
full operation in a way that would have been transparent to our branches and customers and in a way
that would have preserved the continuity of business operations."
The Role of the Plan
Having the hot-site agreement in place was key to Bluebonnet's ability to react and recover quickly. But
just as important, noted BSB's Disaster Recovery Coordinator Patti Smith, was having an automated
business continuity plan that the bank had developed last September.
"We realized that in the event of a disaster, there was a lot of information that we would need to access
quickly," says Smith. "Things like the names and phone numbers of people we needed to contact,
organizational plans, task plans, equipment inventories, and the like. That kind of information is critical
to have at your fingertips if you are going to keep doing business and servicing customers."
So last fall, using plan development software, Smith and the unit managers automated the bank's
recovery plans. They analyzed the needs and functions of their business units and collected the
information necessary to ensure the continuity of each key business function in the event of a disaster.
"It was the availability of this data from the database that allowed us to react so quickly and efficiently,"
says Smith.
The Real-World Test
"Because we were actually up and running again by 9:00 a.m. Friday in Dallas," says Littleton, "this
experience served as a thorough test of our disaster recovery and business continuity plan." And there
are several key lessons that both Littleton and Smith point to as a result of the experience.
"First," says Smith, "you absolutely have to have an automated planning tool in order to maintain the
data that is needed to effect the recovery process efficiently. There is simply no way, realistically, that
anyone could control and update that much information in a simple written plan."
Littleton adds, "The second lesson we learned is that it is so critical that the data in the database be
current and valid that we will now update our continuity plan on a daily rather than a weekly or monthly
basis. All staff changes, CPU or other equipment configuration changes, etc. will be input to the
database immediately. It has to be current."
Finally, both agree that the position of Disaster Recovery Coordinator, Smith's function, must be made
clear and the lines of communication kept open for all who are in any way involved in the recovery. "It
is really important in order to minimize confusion," says Littleton. "We had far too many people calling
all over the place to ask questions when they should have been dealing directly with Patti. But we've
cleared that up now. If anything like this ever happens again, everyone knows that Patti is central
control for all information regarding recovery operations." In fact, Bluebonnet Savings Bank now
regards the position as so important that Smith has been assigned an assistant.
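The second lesson above, keeping the plan database current on a daily basis, can be illustrated with a minimal sketch. It is not how LDRPS or the bank's own database works; the record layout, the `stale_records` helper, and the sample entries and dates are all hypothetical, invented only to show the idea of flagging plan data that has not been verified recently.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PlanRecord:
    """One entry in a hypothetical continuity-plan database."""
    kind: str            # e.g. "contact", "equipment", "task"
    description: str
    last_verified: date  # when someone last confirmed the entry


def stale_records(records, today, max_age_days=1):
    """Return the records last verified more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return [r for r in records if r.last_verified < cutoff]


# Invented sample data, not taken from the bank's plan.
records = [
    PlanRecord("contact", "Recovery coordinator phone number", date(1992, 1, 15)),
    PlanRecord("equipment", "Communications controller configuration", date(1992, 1, 2)),
]

for r in stale_records(records, today=date(1992, 1, 16)):
    print(f"STALE {r.kind}: {r.description} (last verified {r.last_verified})")
```

With a one-day limit, only the equipment entry is flagged; loosening `max_age_days` to 7 or 30 would model the weekly or monthly update cycle the bank decided to abandon.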
Looking Back and Ahead
Although Dallas was back up and running on Friday morning, the disaster recovery team that had flown
to Tampa stayed on over the weekend to troubleshoot the problems with the modems and
communications lines. They returned on Sunday night, tired, but justifiably proud of a job well done.
This time, the disaster was short-lived. But the experience was an important one. It allowed Bluebonnet
Savings Bank to test and refine, under fire, the value and quality of their contingency plan. If there ever
is a next time, they will be prepared.
Mary Lou Roberts is a free-lance writer and industry consultant with more than 25 years of experience
in information systems.
This article was adapted from Vol. 5 #2.
Disaster Recovery World © 1999 and Disaster Recovery Journal © 1999
are copyrighted by Systems Support, Inc. All rights reserved. Reproduction
in whole or part is prohibited without the express written permission from
Systems Support, Inc.