EXECUTIVE COUNCIL
Business Continuity Chronicles
By Jim Hammill, CBCP
EDITOR’S NOTE: This is the second of a seven-part series
featuring the members of our executive council. Through these personal
accounts, we hope to not only highlight their careers, but also give
a seven-sided view of the history of the disaster recovery/business
continuity industry.
Year end, 1986. The year I started a new career direction in the disaster
recovery/business resumption world. The year I realized the number “seven”
wasn’t necessarily a lucky number, but it sure was going to play
a key role in my life. Now it’s close to 20 years later, and things
couldn’t be the same. Or could they?
In 1986, my overarching responsibility was to manage seven different
groups of people who were responsible for seven different functions.
All of these functions, for the most part, ran 24 hours a day, seven
days a week: facilities planning for four major data centers with 37
mainframes, five mid-range data centers with 28 machines, and 16 remote
job entry stations dispersed throughout the business office complex
with many different mini-processors and output devices; vendor coordination
operations; help desk operations; graphics microfiche operation; problem/change
management group; and a data center scheduling group.
And the Story Begins
There I was, working diligently on my computer – black 9 on the
red 10, ace of clubs up – when the phone rang.
“Hello?”
“Hey, Jim,” said Rob, my old data center boss. “How’s
things going for you today?”
“I’m doing fine,” I said. “An interesting day
so far. Payroll blew up last night. Of course, no one knows why, but
once we backed out two new ‘untested mods’ that were magically
added from the development staging library, everything ran fine. I was
assured by the programmer who did it that it wasn’t the problem.”
We chuckled and I went on, “This morning the building engineer
informed me that we have a paper dust and toner problem in the printer
room. The town engineer claims that is a potentially explosive situation
– I’m not kidding! He said we could have an explosion in
the room! Let’s see, what else happened ... there was a fist fight
between two women in the data center last night. At the same time, one
of the guys wasn’t feeling well, and couldn’t make it to
the men’s room, so, as he put it, ‘I didn’t want to
mess up the nice white floor, so I pulled up one of the floor tiles
and threw up under the raised floor.’ His vomit covered the cable
connectors to the data switch, and the vendor refuses to clean it.”
Without taking a breath, I continued, “I have 17 open trouble
tickets, and two mainframes are down because of critical device end
missing errors. I have three hardware vendors pointing fingers at the
others with no resolution in sight. As a joke, one of my employees put
a note on the status board that said the mainframe failures were because
the vendor didn’t replenish the CPU’s ‘dilithium crystals’
during the last maintenance cycle.”
At that point, Rob was laughing hysterically. I said, “Oh, it gets
better. Believe it or not, one of the application managers took it upon
himself to escalate the dilithium crystal problem to IBM. I heard they
were still laughing when he hung up. Now that manager wants me to suspend
the guy. So, all in all, a normal day.”
We both laughed, then he said seriously, “Jim, I have a problem
and I need a guy like you to help me out of a jam. The company has a
little outstanding audit issue that needs to be addressed. You helped
me build these data centers, now I need someone like you to put a plan
in place to protect them.
“It’s going to be a small group with high visibility, and
I want you to lead it. You can hand pick your staff; you get to direct
everything from policy to strategy to implementation. I’ll support
you all the way, whatever you need. What do you say?”
I thought long and hard, and seven things immediately came to mind:
my current range of responsibility; the weekly fist fights; the four
union complaints against two of my supervisors; the hardship of being
on call 24 hours a day, seven days a week and getting called every single
night for the last four years; the garbage cans filled with empty aspirin,
Mylanta and Alka-Seltzer bottles; coming in on different shifts to assure
everyone that they were part of the team; and keeping peace between
the hardware vendor engineers who hated each other.
Yes, it took about seven seconds to reply, “Rob, if you can spring
me I’m yours.”
“Great,” he said, “I already cleared this with the
two directors. You start on the first of the month. Think about who
you want on your staff and get busy with the transfers.”
It’s funny how we sometimes don’t pick up on key phrases
during a conversation. You know, phrases like, “a little audit
problem,” “you get to lead a new group,” “high
visibility,” and “responsible for policy, strategy, and
implementation.”
I found myself heading up the new data center disaster recovery group,
which, oddly enough, was not part of the data center. Ah yes, the good
life, a straightforward process with one goal … “address
the outstanding audit.” A stand-alone group of seven people reporting
to me with the authority to create policy, set strategy, select tools,
and provide direction to the data centers for plan development and implementation.
Oh … Some of the Other Things That Popped Up
The company had a little merger going on. The four major data centers
grew to 19 data centers. Then the 19 sites needed to be consolidated
to, what else? Seven sites. And now, there were seven data center directors
involved, all at a higher level than my boss and, of course, none of
them liked him. We were seven months into the planning when my boss
was reassigned to another critical project. We were both pretty disappointed
and I didn’t feel real comfortable with the change in management.
And Remember, Include All the Players!
... Well, Maybe Not!
I told the data center directors early on that this was turning into
a “data center only” solution, that it wasn’t practical
to create a recovery plan without input from the application owners.
As they looked at me in a combination of disbelief and horror, I was
told, “Jim, let’s not go upset our customers. After all,
it’s not going to do any of us any good to tell them we might
not process their applications if we are running in a degraded recovery
mode. Let’s not include them, for now, and see what we come up
with.”
I was feeling pretty uncomfortable with the use of the word, “we.”
I knew “we” meant “me.” And the more they said
“we,” the harder I’d swallow. But I knew I’d
still get to set policy and select the tools, and my staff would be
considered the overall experts. Pretty straightforward, or so I thought.
The Solution ... or Something Like it
The seven data center directors decided that since this was a new endeavor,
it would be appropriate to trial the new “matrix management approach.”
As it was told to me, “Jim, you’ll sit at the top, setting
the strategy. Your team will assist our teams in plan development. Our
teams will consist of data center operation; teleprocessing; system
support, one each for MVS, VM and Univa; and application support. That
will be a total of seven teams in each of seven data centers that you’ll
matrix manage and control the process. Of course, you won’t have
direct appraisal input into our team’s performance … just
the seven people who report to you. But being as we are now starting
our new upward feedback for the new management appraisal system, you
will have seven subordinate groups rating your performance.”
I thought, how nice!
Success! … Sort of
By 1989, during the company’s and the data center’s continued
downsizing, a new executive director position was added to the management
layer. Each of the seven data center operations directors brought their
groups in to see him one by one to explain to him what they did and
how they did it. He, in turn, decided whether or not their function
was needed in the new environment. Of course, all the operations groups
got to go first. And of course, each group complained that they were
being forced to plan for a data center recovery with a group that was
in fact outside the control of data center operations. So, by way of
“matrix management” magic, we were granted an audience at
which time we were to defend (I mean present) what it was we did for
a living.
Our presentation was flawless. Not one stumble, not one misspelled word
during the four-hour presentation, not one unanswered question on policy,
strategy or plan development. Not one. Well, one. The executive director
said, “You know, I’ve listened to you folks for almost four
hours without interruption. And I understand you’ve successfully
tested the individual components and operating systems.
“My question is, and all the directors agree with me on this point,
why the hell haven’t you included the application owners in this
process? We think it was a mistake for you to exclude them.”
I thought, “There’s that ‘we’ word again. And
he’s using my argument against me! Now doesn’t that beat
the band?”
He leaned back in his chair, put his feet up on the table and lit a
cigar. There was complete silence in the room. He took a long look at
us and said, “Boys, I don’t think your plan is going to
work. I’ll get back to you in a few days. Y’all can go now.”
At noon three days later, my new boss called.
“Jim,” he said, “the executive director is testing
our capabilities and just declared a hypothetical data center failure.
Implement ‘your’ disaster recovery plan to address this
scenario.”
My first thought was, “‘I’ have to do this? What
happened to ‘we’?”
By chance, the first surprise exercise would be a New Jersey to Florida
move. I was also told, in no uncertain terms, that the exercise would
not be finished until the recovery machine was successfully running
in the new data center, which meant, “Don’t plan on coming
home if it fails.”
To our advantage, the executive director chose a huge application which
used an entire mainframe. Consequently, the system and application
backups were all in sync. We quickly got all 1,600 vaulted tapes pulled
and delivered to the airport and freight-shipped on a separate plane.
More than 20 employees were ticketed and told to report to the airport.
They did not have time to go home; clothes and hygiene items were to
be purchased on arrival.
So What Happened?
Because the test was initiated at lunchtime, some key people were out
of the data center. Also, the data center manager, librarian, DASD manager,
and master console operator were declared “dead” because
they re-entered the data center for some documentation. They were not
allowed to participate or talk to anyone during the test. However,
• Sixteen-hundred tapes were pulled by an outside vendor and delivered
to the airport;
• Arrangements were made to freight materials;
• More than 20 passenger tickets and hotel rooms were procured;
• Personnel were assigned to shifts for the recovery operation;
• A command post was set up;
• Bridge numbers were activated and status meeting times established;
• The work load of a test machine in Florida and 20 strings of
DASD were dumped;
• New cabling was run from the recovery machine room to the command
center to control the recovery process, which included punching holes
through existing walls.
The Results?
• SUCCESS. The systems and application were up and running in
under 70 hours.
• What did I learn? The single most important thing was that my core
group of seven managers was recognized as the one place where all the
“piece parts” of data flows and disparate applications,
networks, software, and hardware came together. My managers became the
“experts” on where, why, and how the process linkages worked.
Because they directed and ran the planning meetings, they became intimately
aware of the “big picture” interdependencies.
• The word went out, “Hey, these folks know what they’re
doing!”
Casualties:
• Seven people left operations and one divorce occurred.
Next Chapter
Soon, I was back at my desk, hard at work on the PC … deuce of
clubs up, red 4 on the black 5 ... and my phone rang.
“Hello?”
“Hey, Jim,” my boss said. “We did a great job on that
recovery, didn’t we?”
I thought, “Oh boy, back to ‘we’ again.”
He said, “Listen to this – I just got back from an executive
briefing – seems the auditors are really nailing the application
groups for not having application-specific recovery procedures and policies
in place. The board of directors and the executive committee want resolution
on this and someone to head it up from an enterprise-wide perspective.
I know what you did here was an uphill battle. I know you made a lot
of enemies, but you stuck to your guns and did the right thing. I know
you want to move out of the disaster recovery field but it seems the
CFO and corporate controller have been reading your presentations and
materials regarding business continuity and the need to address recovery
on an enterprise-wide basis.
“So their idea is to move you under them and get corporate-wide
planning going. They don’t have anything in place, so you can work
this issue from the other side. You’ll be in control of all the
things you couldn’t do from the data center end. You know, what
you always said was needed.”
I met with the controller the following week. I was reminded that this
would be a great career opportunity. I would have full control of setting
policy and strategy. I would report directly to the corporate finance
vice president so there was no perceived allegiance to any one line
of business in the company. The project would be fully funded, and I
could pick my own staff.
So there I was – the data center operations guy who once worked
in the print and tape pools – sitting at corporate headquarters
with the controller and the CFO of a multi-billion dollar corporation.
Executive row! How could I refuse?
I said, “Of course I’ll take the job. I’ll let my
boss know and get moving on the paperwork.”
He said, “Jim, we knew we could count on you. Don’t worry
about the paperwork. It will be in motion before you get back to your
office.
“Just one thing, Jim. While you’ll be used as a resource
for all of the company’s lines of business, I’ll want you
to separately lead the business resumption process for my financial
organization. Besides, if I’m paying the freight to protect the
corporate assets, I think it’s in my best interest to have you
direct my department’s business resumption activity. I want you
to meet with the finance application directors.
“We have 483 financial applications that control the corporation.
We’ve divided those applications into separate discrete families,
each headed by a director. We’ll plan our calendars so I can introduce
each one to you, and you can get the job done.”
I said, “Not a problem, how many individual meetings are required?”
He said, “Seven.”
I looked across the desk at him with a blank stare. He asked, “Is
there a problem?”
I said, “You know, strangely enough, I had to deal with seven
directors on the data center side and it wasn’t easy. It’s
career-affecting, if you know what I mean.”
He said, “Yes, I know. You weren’t liked, but you got the
job done. Ironic, isn’t it? You have to force people to do the
right thing and you’re disliked by the same organizations you
helped. You were in a thankless position. Unfortunately, Jim, there’s
a good chance you won’t be liked here either. But look at it this
way … at least you’re used to the feeling.”
What’s the Lesson?
With all of the advancements in technology that have affected the business
resumption arena – from centralization, to de-centralization,
to miniaturization – the story stays much the same today as it
was in 1986, 1989, and 1996. Business continuity has yet to be institutionalized
in corporate America. The names have changed, but the stories have not,
and getting executive row to truly support contingency planning is still
a frustrating roadblock. Technology will not solve the “problem.”
Technology solves “issues” but creates a “new set
of problems” to be managed. It’s still an uphill battle,
but keep up the good work – the company you save may be your own!
In closing, it’s important to remember you need to be part of
the solution. I tried to take a lighthearted slant in this article but
there were a lot of disappointments and arguments and career-affecting
moments. I stayed active in the company and in the industry and ultimately
got the support I needed. My career started years ago walking the “halls
of the operations,” which led to the “halls of corporate
headquarters and executive row,” and ultimately I even got to walk
the “halls of the White House” as a special advisor to one
of the president’s cabinet members.
Not bad for an old data center disaster recovery planner!
James Hammill is an independent consultant and has been an active participant
in business continuity for 18 years. He has been an advisor to private
sector CIOs and the Federal Emergency Management Agency, and a member of the
Natural Hazards Caucus Committee, advising 18 U.S. senators to widen the
understanding in Congress of risk and cost reduction for natural and man-made
disasters. Hammill served on the Disaster Recovery Journal Editorial Advisory
Board for many years and is now a member of the DRJ Executive Council.
©Copyright 2004 Systems Support Inc. All rights reserved. Reproduction in whole
or in part in any form or medium without the express written permission
of Systems Support Inc. is prohibited.