October 26, 2007

Hoping for the Best, Preparing for the Worst

Written by David W. Stacy, CDRP and Piotrek Stamieszkin

During the last several years, great strides have been made in the development of Business Resumption Plans (BRPs) in many organizations. However, having paper plans is not enough! In our contacts with many business resumption planners in other organizations, it became obvious that most of them did not realistically exercise, or 'rehearse,' the majority of their plans. This is especially true of many Incident Management Plans (IMPs) which usually involve the most senior management in the organization. The only notable exception we found was that most companies with advanced contingency plans invest substantial resources in tests of the recovery of their computer centers using a variety of third-party 'hot sites.'

Our contention is that the senior managers who, in the event of a serious incident, will have to deal with a multitude of critical, out-of-the-ordinary decisions need a similar opportunity to rehearse their performance. Only with such practice can these people prepare to effectively participate on their organization's Incident Management Team (IMT). On the other hand, not providing them with this learning opportunity is equivalent to teaching pilots to fly planes by having them only read books and watch instructional films. Would any of you like to be on a plane piloted by such an aviator?

At my company, Union Life Insurance Company of America (UNUM), we decided at the very beginning of our BRP development process that we would use realistic rehearsals to exercise our plans and prepare our recovery team members to deal with a variety of situations. We also wanted to use these rehearsals as learning experiences rather than as tests. The word 'test' is emotionally charged and implies that you either 'pass' or 'fail.' On the contrary, this whole process is about learning.

As soon as we finished the first draft of our Incident Management Plan, we held a two-hour 'walkthrough' based on a simple fire scenario. The entire Incident Management Team had to prepare responses to different 'facts' that we introduced. While still largely a 'paper' rehearsal, it required the IMT to use the plan and develop a list of actions with time frames attached to them. The entire IMT reviewed these actions and coordinated the response to the fire scenario. During this rehearsal, a few flaws in our plans surfaced:

  • overlapping responsibilities among team members;
  • not involving appropriate experts in decision making;
  • the need to develop a stronger team.

This resulted in a number of changes that were incorporated into a second draft of the IMP.

Before we rehearsed the new and improved version of the IMP, an unannounced 'after hours' team activation exercise was conducted to measure how long it would take us to assemble our core response team. The exercise was triggered by one of our auditors so that all members of the IMT, or their alternates, could meaningfully participate. Audit measured our performance and the results were encouraging: within 90 minutes from the initial phone call to Security reporting the 'incident,' we could assemble almost the entire team at one of our pre-defined Emergency Operations Centers (EOC). Now IMT members wanted an opportunity to further enhance their preparedness and exercise the improved plan they had developed. This led us to the design of a dynamic simulation exercise.

Enter the dynamic simulation approach. Certainly the first question is: What do we mean by a dynamic simulation? We defined it as a simulation exercise where successive events would develop differently, depending on decisions and actions taken preceding them. This means that the outcomes of such a simulation are very useful in assessing the IMT's decisions - it is possible to make 'bad' as well as 'good' decisions using this approach without 'ruining' the scenario. For example, the timing of the decision to evacuate a building would affect the number of injured when the 'incident' occurred. Or, the way in which an inquiring next-of-kin was treated would affect the content of a subsequent news account of the 'incident.'
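
The evacuation example can be made concrete with a minimal sketch in Python. It is an illustration only, not a record of how any exercise was scripted: the function names, the collapse time, the occupancy figure and the constant evacuation rate are all hypothetical assumptions chosen to show how a later event can be derived from an earlier decision.

```python
from typing import Optional


def trapped_at_collapse(evacuation_ordered_at: Optional[int],
                        collapse_at: int = 30,
                        occupants: int = 50,
                        evacuated_per_minute: int = 3) -> int:
    """Return how many occupants are still inside when the 'incident' occurs.

    Times are minutes from the start of the exercise. Every default value
    is a hypothetical assumption made for this sketch.
    """
    if evacuation_ordered_at is None or evacuation_ordered_at >= collapse_at:
        return occupants  # no evacuation under way: everyone is still inside
    minutes_evacuating = collapse_at - evacuation_ordered_at
    evacuated = min(occupants, minutes_evacuating * evacuated_per_minute)
    return occupants - evacuated


def next_event(evacuation_ordered_at: Optional[int]) -> str:
    """Choose the follow-on message the facilitators would play next."""
    trapped = trapped_at_collapse(evacuation_ordered_at)
    if trapped == 0:
        return "Security reports the building was empty when the roof fell."
    return f"Security reports {trapped} people trapped inside."
```

Under these assumed numbers, an order given at minute 10 empties the building before the collapse, while an order at minute 20 leaves 20 people inside. Either way the scenario continues; only the follow-on events differ.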

In our simulation, the members of the IMT did not have any scenario information prior to the rehearsal. This was perhaps the most difficult element in selling this type of exercise to senior managers. Initially, they wanted to know what would happen and what their response should be. We convinced them to prepare for the exercise by simply becoming more familiar with the IMP and to draw upon their knowledge and experience to react to the situation confronting them.

A lot of preparation went into organizing this rehearsal. In order to coordinate all incoming phone calls, we set up a 'simulation center' adjacent to the EOC and staffed it with three people who directed and timed all key events. An observer in the EOC kept the simulation center informed about what was happening. In some instances, adjustments were made to the timing of events in order to speed up or slow down the simulation. All of the actors recruited to participate in the exercise telephoned the simulation center at a pre-designated time and were forwarded to the EOC to play their roles. Similarly, all phone calls out of the EOC went to the simulation center where they were logged and notes taken on the IMT's 'decisions.'
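
The bookkeeping described above (a timed list of events, timing adjustments during the exercise, and a log of outgoing calls) could be kept in something as small as the following Python sketch. The class name, its methods and the minute-based clock are hypothetical; the exercise described here was run by people with phones, so this is only one way such a schedule might be modeled.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SimulationCenter:
    # (planned minute, description) pairs; contents are illustrative only
    schedule: List[Tuple[int, str]]
    call_log: List[Tuple[int, str]] = field(default_factory=list)

    def shift_pending(self, now: int, delta: int) -> None:
        """Delay (delta > 0) or advance (delta < 0) every event planned after `now`.

        Advanced events are never moved earlier than the next minute.
        """
        self.schedule = [
            (t if t <= now else max(now + 1, t + delta), what)
            for t, what in self.schedule
        ]
        self.schedule.sort(key=lambda item: item[0])

    def due(self, now: int) -> List[str]:
        """Return events whose time has come and remove them from the schedule."""
        ready = [what for t, what in self.schedule if t <= now]
        self.schedule = [(t, what) for t, what in self.schedule if t > now]
        return ready

    def log_outgoing_call(self, minute: int, note: str) -> None:
        """Record a call from the EOC along with a note on the decision made."""
        self.call_log.append((minute, note))
```

A facilitator who hears from the observer that the team has gone quiet could call `shift_pending` with a negative `delta` to bring the next calls forward, or with a positive one to give the team more time.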

The Emergency Operations Center was equipped as it would be in a real emergency. However, since the simulation, we have further conditioned the facility based on some additional needs identified during the exercise.

Our scenario called for an 'incident' involving a roof collapse due to a very substantial accumulation of snow. It is noteworthy that our rehearsal was held in mid-December 1995 on the day following the first snow storm of the season. Unfortunately, in early 1996 this scenario became a reality at many locations along the Eastern seaboard. One has to be careful in choosing scenarios!

Location of the 'incident'

The affected building houses approximately 600 people and, at the time of the simulation, the largest of UNUM's two corporate data centers. It is part of UNUM's home office campus, which includes another large office building and a day care center, in Portland, Maine. Due to the size of the building and the impact on our data processing operations, any incident rendering this building inoperable would be deemed very serious.

Orienting the IMT members

While the simulation exercise was scheduled well in advance, no information was available to team members until the morning of the rehearsal. All participants, 19 members of the IMT representing key areas and a number of observers, received an audio cassette to listen to on their way to work. The recording consisted of a notification call and a segment of a 'radio broadcast' from a local FM station that introduced the time and some basic conditions for the simulation. The time was early morning and when people entered the Emergency Operations Center, the clocks showed 6:00 a.m. (it was actually 8:30 a.m.). Also, the news segment of the radio broadcast provided information about two storms that had dumped 25 inches of snow on Portland during the past week and another one on its way up the coast that day.

Activating the IMT

The reason for activating the IMT was concern about the structural integrity of the roof due to the heavy snow accumulation. This had first been brought to the attention of Security by the third shift data center operations team who reported strange sounds coming from overhead. This situation posed a potential threat to the safety of our employees as well as the possibility of a business interruption of greater than one day - two of the four criteria we use to decide whether to activate our IMT. Under our plan, the decision to activate is made by our Incident Manager (IM) who is a senior vice president responsible for our facilities and information systems division. The IM relies heavily on our manager of security and director of BRP for information needed to make this decision. Once a decision is made to activate, these three people use a calling tree to mobilize the rest of the IMT and to inform our senior executives, who are not on the IMT.
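
A calling tree of this kind can be pictured as a mapping from each caller to the people that caller is responsible for reaching. The Python sketch below is a simplified illustration under stated assumptions: the three starting roles come from the description above, but the function name, the tree contents and the idea that each caller reaches all of their contacts in one 'round' are hypothetical.

```python
from typing import Dict, List


def call_rounds(tree: Dict[str, List[str]], starters: List[str]) -> List[List[str]]:
    """Group contacts by the round in which they are first reached.

    Simplifying assumption: each caller reaches all of their contacts
    within a single round, and nobody is called twice.
    """
    reached = set(starters)
    rounds: List[List[str]] = []
    frontier = list(starters)
    while frontier:
        next_frontier: List[str] = []
        for caller in frontier:
            for contact in tree.get(caller, []):
                if contact not in reached:
                    reached.add(contact)
                    next_frontier.append(contact)
        if next_frontier:
            rounds.append(next_frontier)
        frontier = next_frontier
    return rounds


# Hypothetical tree: only the three starting roles are taken from the text.
example_tree = {
    "Incident Manager": ["Senior executive A", "IMT member A"],
    "Manager of Security": ["IMT member B"],
    "Director of BRP": ["IMT member C", "IMT member A"],
    "IMT member B": ["Alternate B"],
}
example_rounds = call_rounds(
    example_tree,
    ["Incident Manager", "Manager of Security", "Director of BRP"],
)
```

Listing the rounds this way makes it easy to see how many hand-offs separate the activation decision from the last person notified, which is one of the things an unannounced activation exercise measures.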

When IMT members assembled at the Emergency Operations Center, they were briefed on the roof situation by the manager of security and the manager of facility engineering, both of whom had inspected the site earlier. Following the briefing, the manager of facility engineering left the room to 'return to the building site' to meet a structural engineer who had been called in prior to the IMT activation. In actuality, he went to the simulation center. Then, the director of BRP briefed the IMT on all of the business units in the facility, their critical functions and numbers of employees in each unit.

After 15-20 minutes of discussion, the IMT decided to evacuate the building as a precaution. This decision was 'communicated' to our Security Center located in the building adjacent to the one with the potential roof problem. In reality, this evacuation order was called into the simulation center. This was a key point in the exercise because we were prepared to take the scenario in a number of different directions depending upon whether an evacuation was ordered and, if so, how early in the exercise.

The roof collapses

About 10 minutes after the evacuation order was given, the EOC received a call from the 'Security Center.' An out-of-breath and frantic security guard informed the IMT (over a speaker phone) that in the middle of the evacuation of the 50 people in the building at that early hour, the roof had collapsed. Several people, including the manager of facility engineering, were trapped inside. The guard told the IMT that he had called 911 and emergency personnel were on the way. He also reported that water from broken pipes had already started to freeze at the site and there was the smell of gas in the air.

At this point, the atmosphere in the EOC changed noticeably. Whereas before the IMT members were quite relaxed and mildly serious about the exercise, all of a sudden they became very serious and focused. The simulation was so real that many of them told us later that they had actually experienced some of the physiological symptoms of stress, such as a rapid heartbeat and perspiration.

As various subject matter experts on the team began trying to assess the impact of the incident and formulate alternative responses, things began to happen very quickly. First, a woman burst into the room to announce that there was a live on-the-scene report on the radio from a reporter who drove to the scene after hearing about the incident on his police scanner. The IMT listened to this 'report' which was, in reality, pre-recorded on a cassette. Also, because the manager of facility engineering was thought to be trapped in the building, his alternate, who had no advance knowledge of the exercise, had to be called to join the IMT.

Phones start ringing

Within 15-20 minutes of the roof collapse, the EOC received a number of calls in quick succession that tested various members of the team. One was from the hysterical husband of a woman who worked in the building. Another was from a woman who spoke only broken English and whose sister worked in the affected building. Still another was from the principal of a school who was trying to calm the hysterical child of a single parent who worked in the facility. In another case, an employee burst into the EOC seeking information on a relative who worked in the damaged building. In reality, these people were actors who had been recruited and coached to be emotional and unreasonable and to demand immediate action from company officials. One accused the company of being cold and uncaring and threatened to go to the newspaper with his complaint. He did and the IMT had to deal with a pesky and belligerent 'reporter' later in the exercise.

One of the more interesting calls to the EOC was from our CEO's secretary. We had recruited her to call in the event that the IMT did not notify the CEO's office of the incident in a timely manner. They didn't, so she called and told the IMT that she had heard about the roof collapse on her drive to work. The CEO was out of the country, but scheduled to call her in a half-hour. She needed to know what to tell him or to whom to re-direct his call.

At another point in the exercise, we turned on a TV in the Emergency Operations Center so the IMT could hear the latest forecast about the new storm heading up the east coast. In reality, this was a videotape of an old weather forecast obtained from a meteorologist at one of our local stations with a pre-taped voice-over relevant to our simulation. This broadcast affected the IMT's decisions about whether to close our local offices for the day and how to get our data center recovery team to our 'hot site' 300 miles to the southwest where the storm was at its zenith.

'Smoking gun' memo

Another interesting scenario was a call into the EOC from a 'reporter' who claimed to have a copy of an internal memo, dated over a year ago, that warned of possible problems with the roof which had just collapsed. The reporter had concluded that the fact that the roof collapsed meant the company had done nothing to address the safety concerns raised in the memo. He said he would fax the 'smoking gun' memo to us and then he wanted to discuss it with someone in a position of authority. A few minutes later, someone from the simulation center carried the 'fax' into the EOC and gave it to the vice president of our facilities department. This senior manager, who had been coached in advance, told the IMT that we had spent $150,000 to repair the roof the previous summer and it had subsequently passed inspection. However, he told the team that the roofing company we contracted to do the work had recently declared bankruptcy, creating a potential public relations dilemma for us.

The crisis deepens

About two hours into the scenario, 'Security' called the EOC to report that despite an earlier radio and TV announcement that the IMT had made telling employees not to report to work, many were showing up anyway. A particular problem was employees from other UNUM buildings in Portland trying to drop off their children at our child care center which is adjacent to the building with the collapsed roof. The guard needed instructions on what to tell people. He also told the IMT that since the main switchboard, also in the affected facility, was not staffed, phone calls from insurance customers were ringing in the Security Center. These calls were affecting Security's ability to deal with the emergency and they needed someone to take these calls.

Still later in the scenario, Security called the EOC to report that they had just heard about casualties from the local fire chief. Twenty-seven (27) people, all alive, had been rescued from the building and transported to area hospitals. Security gave the IMT the numbers of two hospitals.

But, when an IMT member called the 'hospitals' to inquire about the names and conditions of victims, he was told by the actors who answered the phones that no information could be released due to patient confidentiality rules.

Finally, one of our last scenarios had a real vice president of one of our business divisions call the EOC. She was expecting some important guests from out-of-town that morning. These guests were part of a team negotiating to buy one of our businesses. Among other things, the guests were here to tour our data center. The vice president was looking for guidance on what to do with these guests now.

Results of the simulation

The simulation presented the IMT with a number of significant challenges that felt very real. Considering that this was our first such exercise, the team handled these challenges very well. Collaboration and teamwork were evident, and decision making was more efficient and effective than we expected.

A post-rehearsal survey of the IMT showed that 100 percent of the participants felt that the simulation enhanced their ability to deal with an incident. Also, 93 percent felt that the simulation was realistic and provided a valuable learning experience.

Both the human resources and communications representatives on the IMT reported feeling a bit overwhelmed by the number and rapid-fire pace of the challenges that the simulation presented to them. They jokingly accused us of 'ganging up' on them. But, we told them that we had purposely loaded up the simulation with these types of issues because this is what would happen in a real incident such as this. Finally, all members of the IMT agreed that we need additional rehearsals in the future and were very enthusiastic about participating again.

The simulation was very educational for our observers, too. Some of these were senior executives without a pre-defined role on the IMT. Others were senior managers of business units who are scheduled to develop work group recovery plans, subordinate to the Incident Management Plan, in 1996.

For the top executives, the solid performance of the IMT during the rehearsal gave them increased confidence that the IMT could handle a real crisis while they could continue to manage the unaffected portions of the business.

For the business heads, the rehearsal clearly demonstrated the scope and role of the IMP and created a clear context within which their business units could begin their own recovery planning.

Conclusion

At UNUM our BRP motto is: 'hope for the best, but prepare for the worst.' Our experience has convinced us of the value of realistic simulation exercises to prepare people, particularly senior managers, to deal with serious incidents.

These incidents are not the same as the everyday 'crisis management' activities in which most senior managers participate. A crisis like the one we simulated is in a different league altogether.

And a person's performance in handling day-to-day situations is not necessarily a good predictor of how he/she will react to 'the big one.' Rather, when the stakes are real, we believe in the old sports adage which suggests that 'you will play the way you practiced.'

David W. Stacy is Manager of Information Assets Protection with Guidant Corporation. Piotrek Stamieszkin is director of Business Resumption Planning for UNUM Life Insurance Company of America and Enterprise Staff Operations in UNUM Corporation.

This article was adapted from Vol. 9, #4.

Last modified on October 11, 2012