
Small Leak Teaches BIG LESSON
By David Fettig
At 12:52 a.m., a flood alarm was triggered on the third floor of the Minneapolis Fed as water poured from the ceiling and onto the bank's mainframe computer.
* Within seconds of the third-floor downpour, four employees began covering the mainframe computer and other equipment with
plastic. But the water came fast and the damage was swift: the $3 million mainframe was shut down.
* Located between the third and fourth floors, on the south side of the building, the flood's source was a small hole in a pipe that carries well-water for the bank's air conditioning system. The hole occurred at an elbow in the two-inch pipe, and eventually allowed between 1,000 and 2,000 gallons of water to escape. The cause of the hole is unknown, although the hardness of the well-water, the velocity of the water's passage through the piping and the presence of sand in the water may all be corrosive factors.
* The water spread over about 80 percent of
the third-floor ceiling, soaking the fiberglass
atop the ceiling panels. When the water
became too much for the fiberglass to hold,
the water poured down through the ceiling in
a torrent and eventually soaked through to the second floor, forcing the personnel department to move to another site within the
bank. Some water also leaked onto the exterior plaza from the second floor.

* By 1:30 a.m. many key officers and staff people, including the bank's computer technicians, had been notified. As staff arrived and assessed the damage, it became clear that the bank's contingency computer site in Culpeper, Va., would have to be activated. Calls were then made to Culpeper and to the bank's local off-site operations center, the Postal Data Center in Bloomington. Within a couple of hours of the incident, six employees arrived at the bank with their bags packed, ready to board a chartered plane to Culpeper.
* Usually the third floor is filled with the perpetual hum and whir of a $3 million mainframe computer and other electronic equipment, but employees who arrived during the early-morning hours were greeted by an eerie silence, punctuated only by the sound of dripping water. Employees likened the initial flooding to a thunderstorm, which gradually dwindled to a light spring rain.
* Throughout those early morning hours and
over the next few days, bank officers met
hourly from early in the morning until late in
the evening to resolve problems and update
progress. As the crisis abated, the
management team met less frequently and
schedules returned to normal.
Lesson 1: Standardized, high-tech
operations outperform others that combine
various modes.
Lesson 2: While certain bank operations may be designated as critical in a disaster recovery plan, over time, all operations are
critical, and the important thing is to resume business as usual.
Lesson 3: You are never as prepared as you think you are.

In the exhilarating, yet sobering, aftermath of the Minneapolis Fed's April 8 flood crisis, those three lessons loom large.
When a water pipe burst above the bank's third-floor computer site on that early Monday morning, the ensuing downpour set in motion a disaster recovery operation unparalleled in the Federal Reserve System. Within minutes of the deluge, the decision was made to transfer computer operations to the Federal Reserve System's backup site in Culpeper, Va.; and by 3 a.m., employees began arriving at the bank with their bags packed for an early-morning charter flight. Other employees quickly moved to establish local off-site operations at the Postal Data Center in Bloomington.
At the start of business that Monday, just a few hours after the bank's computer mainframe was deemed inoperable, about 50 employees were stationed at the Postal Data Center, six were in Culpeper loading the bank's software into the back-up computer, and other departments affected by the flood were restationed throughout the bank. By noon that same day, 10 hours ahead of the disaster-plan schedule, electronic wire service was fully restored.
And that was the simple part.
The hard work of disaster recovery (for example, establishing efficient communication links with financial institutions and addressing the pressing needs of non-critical bank functions) would be completed in the ensuing days.
High-Tech Gets High Marks
Getting the mainframe up and running according to plan was probably the easiest part of the recovery, says Colleen Strand,
Minneapolis Fed chief financial officer and senior vice president in charge of disaster recovery. And, in the beginning, it was also the
most important part. In this age of electronics, when the financial services industry is increasingly reliant on electronic data
processing, a Fed bank's computer mainframe is the heart of the institution.
Every day the Minneapolis Fed moves about $10 billion electronically through its wire transfer and automated clearing house (ACH)
system, which enables companies to validate transactions, and allows automatic deposits and bill-paying for consumers. Also,
banks use electronic services to manage their reserve funds and to balance their daily interbank accounts.
As it happened, the timing of the flood could not have been much better. During the early morning hours of Monday, the mainframe
computer completes ACH work from the previous Friday and has not yet begun the heavy load of a new business day. Had the
accident occurred at 2 p.m. Monday, for example, there would have been more problems, according to Thomas Kleinschmit,
assistant vice president for Electronic Payments and Network Services.
Still, those ACH files from April 5th that were in the hopper when the computer shut down posed particular problems that took
about a week to remedy. Some files were lost because there was no time to back up the transactions; ACH staff then had to
methodically replay the transactions from April 5th with each financial institution to ensure that every item was properly accounted
for.
And, as Kleinschmit says: "ACH is a delicate balance in normal times, let alone when something like this happens. ACH is complex, very complex." For example, each ACH file contains 1,000 payment transactions that must be individually processed; the Minneapolis Fed handles about 5.5 million such items every month. "ACH is really resource intensive, computer intensive," Kleinschmit says, a point that was greatly emphasized during the flood recovery.
And that reliance on computer technology goes beyond the Minneapolis Fed and extends to the financial institutions that use the bank's electronic network services. While some institutions receive their financial information from the Fed on paper copies or magnetic tape, many rely on electronic services. Aside from the lost files of April 5, the maintenance of computer links with financial institutions proved to be the most enduring problem during those first days of the recovery.
In order for some financial institutions to use the Fed's wire transfer service during the initial stage of the recovery, they had to dial in on specially leased phone lines that allowed just one caller at a time; this meant that institutions had to form daily queues.
Also, according to Strand, some transmission troubles existed because of the Fed's policy of allowing financial institutions to use a variety of computer and peripheral equipment, like modems and printers, to communicate with the Fed. That meant that the Fed had to scramble to establish special links with individual institutions, which was a time-consuming proposition for the already overworked technical staff. In other words, those financial institutions with the most up-to-date and standard equipment fared much better during the recovery period.
Some Federal Reserve banks only allow their financial institutions to use one standard set of equipment, and they require those
institutions to test the equipment on a regular basis, according to Susan Mendesh-Fitzgerald, Disaster Recovery Planning Manager.
Those who don't have the optimal equipment and who don't test are given a low priority in the disaster recovery plan.
"We've never been that aggressive here," Strand says. "We want to please our customers and they don't want to have to buy new equipment and new technology, so we've accommodated them. In a disaster situation, however, you find out that that policy can cause problems. As long as you don't have a disaster, your customers are happy."
Strand says the current policy may be reevaluated as the bank reviews the recent events and fine-tunes its disaster recovery plans.
She is also quick to say that not all financial institutions experienced problems during the recovery period. In fact, with the fast
start-up time of the Culpeper mainframe and the efficient links with some institutions, Strand says that many institutions experienced
no disruption of service and only became aware of the flood after they were directly informed by the Fed.
An Intricate Web Of Computer Reliance

While the emphasis of disaster recovery has traditionally been on the immediate resumption of the data center and the critical functions like electronic network services and certain accounting areas, the current crisis stressed the need to also prepare for the resumption of other bank functions. When the main computer of a large corporation goes down, there are many jobs within the bank that are affected. For example, data for certain research publications was unavailable during the initial stage of recovery, and the bank's supervision department had to borrow the computer resources of another Fed bank in order to establish linkage with the computers of the Federal Reserve Board in Washington, D.C.
Ironically enough, at the time of the flood, the bank had just begun work on a more comprehensive disaster recovery plan. As Strand explains, there is a distinct difference between recovering a data center and recovering the ability to conduct day-to-day business. "The difficult thing about disaster recovery is resuming your business, making sure your customers are connected, that information is flowing, that what are considered non-critical functions are up and running. That's where disaster recovery literature is beginning to focus: on resuming business."
Strand says the Minneapolis Fed quickly realized the need for an adequate business recovery plan. For example, while the Culpeper
mainframe was quickly engaged on the first day, by the second day some financial institutions were calling for records of their
recent transactions, and those records weren't immediately available; by the fifth day, those institutions were still calling.
Stresses The Need For Thorough Planning
Last year the Minneapolis Fed distributed a booklet to its customers, or financial institutions that use Fed services, that spelled out
the steps they should take in the event of an emergency that disrupted connections to the Fed's critical electronic services.
Following the flood of April 8th, very few institutions used the booklet as a resource or had any sort of plan to deal with such a
contingency. In the early days of the recovery the resultant phone calls from confused financial institutions swamped much of the
Fed staff.
In review, Strand says, the Fed's disaster preparedness planning fell short: "What we had tested, to be very blunt about it, is whether we could get the data center up. What we hadn't tested was whether we could resume and sustain operations in a disaster mode for several days. We took our testing seriously but we now know that it never went far enough. Of course, you can test all you want and people still run on instinct much of the time. As it happened, we did well despite the limited testing."
Doug Fleming, vice president at the Kansas City Fed and district recovery manager for the Federal Reserve System, agrees that the
bank did well. "All in all, my assessment is very positive," he says. "They handled the recovery very well."
Every Federal Reserve District bank has its own recovery plan, Fleming says, but each plan is approved by a Systemwide group of
first vice presidents.
And even though each bank has its own plan, the entire system will learn from Minneapolis' experience. Since the Federal Reserve System converted its computer site in Culpeper, Va., into a back-up system in 1984 (it formerly housed a system communications network), the Minneapolis Fed is the first bank to use its services on anything but a test basis.
Fleming says that officials from all Fed District banks have been taking notes on Minneapolis' situation and that, eventually, the flood crisis will serve as a learning model for the entire system. "We can practice, test and plan all we want, but when we actually have to use a recovery plan we learn the most," he says.
Staff Is Everything

From the moment the first gush of water poured through
the third floor ceiling and four computer workers
responded by quickly covering the equipment with
plastic, until weeks later when the last question from a
perplexed customer was answered, employee response to
the flood has been critical to the bank's recovery, bank
officials say.
"Staff is everything," Kleinschmit says. "You may have all the computers in the world, but your staff is your major card. And this staff rose to the occasion."
Employees in many departments worked extended hours
during the first few weeks of recovery (as many as 18 to
20 hours per day during the initial stages), and they got
together to help solve child-care and transportation
problems. Also, employees who were not directly
impacted by the crisis got involved by volunteering to
make phone calls to Ninth District financial institutions to
provide periodic updates.
All of which, perhaps, suggests a final important lesson:
Lesson No. 4: People are the key.
David Fettig, The Region, Federal Reserve Bank of Minneapolis. Reprinted from Region Magazine.
This article adapted from Vol. 4 No. 3, p. 8.
Disaster Recovery World © 1999, and Disaster Recovery Journal © 1999, are copyrighted by Systems Support, Inc. All rights reserved. Reproduction in whole or part is prohibited without the express written permission from Systems Support, Inc.