
When a disaster has a widespread effect on a large number of businesses, as the recent blackout did, one way to judge the impact is to look at the number of declarations industry service providers received. In the following articles, three service providers (in alphabetical order) give an inside look at the impact the blackout had on their clients. The articles are not intended to promote a particular service or provider; instead, they are an opportunity for our readers to look behind the scenes at the impact a major disaster can have on businesses large and small.

Hewlett-Packard
By BELINDA WILSON, CBCP

The pictures were all over the cable news networks. Gridlocked traffic, people walking home or sleeping in train stations, hundreds of airline passengers delayed and inconvenienced … all because of a major power outage that crippled many activities in the U.S. Northeast and much of Ontario, Canada.

We say “many activities” because one thing we didn’t see on the newscasts was major enterprises discussing the losses they were experiencing because of the blackout. Some estimates place the total loss to businesses at $6 billion. Even so, it seems that this major outage had only a minor impact on most. For example, Wall Street’s top financial firms reported business as usual on Friday, thanks to back-up generators and quickly implemented continuity plans.

As for our own HP clients, there was one “declared disaster” and three alerts that did not result in a declaration. The customer who declared, a financial services company, has a critical services and electronic vaulting agreement enabling them to be operational within 8 to 24 hours of their declaration, with their critical data in place. Their recovery hardware was commissioned, the operating systems were loaded and disk volumes were all configured and ready to go, all within a few hours. The client began operations at the HP Recovery Center outside Philadelphia and was quickly back up and running. Several other customers notified HP that they, too, might be running operations out of our recovery center if the outage went on much longer, but never had to actually make the move.

There were a number of other customers whose business continuity plans worked like a charm. One very large trading exchange, for example, had rehearsed its plans a number of times for a variety of scenarios, including a power failure affecting its building or block, though nothing as widespread as last week’s outage. The good news is that the plan was a success. In fact, end-users weren’t even aware that they were operating on standby systems run off diesel generators. That is quite an accomplishment: seamless, essentially transparent continuity.

Preparing For The Next Event

So why did such a major unplanned event have such a small business impact? This was primarily because many enterprises have stepped up their planning for the “unplanned” since Y2K and Sept. 11, 2001. This outage provided an excellent test of their business continuity plans. From what we can now tell, most passed the test with flying colors. They can now prepare for an even better response to the next event by fine-tuning their existing plan through rehearsals, program review and improvement, and an effective change management process.

Experience shows that power-related outages are a major – and, in fact, the leading – cause of disaster declarations into recovery centers. In some cases, power outages are just that: outages caused by a direct failure of the power utility. In other cases, such as the Chicago tunnel system flood, many organizations declared disasters because of a power outage, but the real culprit was flooding into buildings, which took out the power distribution systems. The utility company wasn’t affected, but power was out nonetheless.

Hopefully, the blackout also served as another wake-up call to the organizations that still do not have a business continuity plan in place. It reinforced the fact that disasters can happen anytime, anyplace, and for a wide variety of reasons. The only predictable element is that the next event will be just as unpredictable as previous ones. So how should these organizations proceed?

An Ounce of Preparation Is Worth a Pound of Cure

We believe several organizations were not severely impacted because of their focused efforts to have effective business continuity plans in place. DRI International, a non-profit organization of professionals in business continuity, crisis management, and emergency response, recognizes the importance of preparedness and planning through its Professional Practices. This 10-step model allows practitioners to build effective strategies and plans that mitigate risks, identify business impacts, and develop continuity and recovery plans. A key component of these best practices is to rehearse the plan, both announced and unannounced, for a wide variety of circumstances. Experience tells us that the more a plan is rehearsed, the more committed, and the more confident, the company becomes. Additional details about DRII are available at www.drii.org.

What Can We Learn From the 2003 Blackout?

If we treat this disaster as an opportunity, we can draw some serious lessons that we may not even have considered a month ago.

• We learned how interdependent and connected our power grid system is today. The vulnerability of an antiquated system provided the foundation for the cascading effect of power outages across the grid.
• Power outages cause more disruptions and problems than we likely considered. The blackout affected our entire societal infrastructure, impacting traffic, logistics, public safety, and telephone systems.
• Our water supply is dependent upon electrical power – from the treatment plants through distribution. Most cities have only a 24-hour supply during a power loss. That supply has to remain available and cannot be re-routed for emergencies such as fires.
• The geographic location of recovery centers relative to operations and offices needs to be examined, especially if they are within the same power grid.

How Can We Be Better Prepared?

Probably the biggest lesson learned by many companies affected by this power failure concerned the value of being prepared and proactive rather than merely reacting to circumstances.

Other preparatory steps you might consider include:

• Know how the power comes into any facility. What are the sources? How many power feeds are there connected to each facility? What is the history of outages in any particular region?
• Ensure there is a UPS solution (battery, generators, or both) for all critical systems, including those in the general office environment. These would cover PCs, monitors, phone switches, printers, and servers. Understand which areas of a facility have emergency power backup for lighting, pumps, and elevators.
• Have a plan, with a timeline, for shutting down operations if a power outage exceeds the limit of your alternate power supply, such as battery backup or generators. Prioritize which less critical operations may be shut down immediately. Establish a definitive time by which all operations must be shut down before the alternate power supply is exhausted.
• Consider how geographically dispersed your back-up facilities are. Are they on the same power grid? Are they within the same region where similar bad weather patterns or other natural disasters may strike? Many companies learned that building their backup facilities and alternative work area spaces across a river was not sufficient during a major power failure. Simply backing up data to redundant servers offsite for storage is also not necessarily an adequate solution.
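The shutdown-timeline step above lends itself to a simple back-of-the-envelope calculation. The sketch below (all system names, priorities, and timings are hypothetical) works backward from the estimated runtime of the alternate power supply, scheduling less critical systems for earlier, orderly shutdowns so that the most critical systems run as long as safely possible:

```python
from dataclasses import dataclass

@dataclass
class System:
    name: str
    priority: int       # 1 = most critical, shut down last
    shutdown_mins: int  # time an orderly shutdown takes

def shutdown_schedule(systems, runtime_left_mins, reserve_mins=15):
    """Return (system name, start minute) pairs, earliest first.

    Shutdowns are assumed to run one at a time; every shutdown must
    finish before the power reserve is reached, and the most critical
    system keeps running the longest.
    """
    deadline = runtime_left_mins - reserve_mins
    schedule = []
    for sys_ in sorted(systems, key=lambda s: s.priority):
        start = max(0, deadline - sys_.shutdown_mins)
        schedule.append((sys_.name, start))
        deadline = start  # the next, less critical system must finish earlier
    return list(reversed(schedule))

# Hypothetical example: 120 minutes of generator fuel remaining.
systems = [
    System("core transaction processing", priority=1, shutdown_mins=30),
    System("reporting servers", priority=2, shutdown_mins=20),
    System("office e-mail", priority=3, shutdown_mins=10),
]
plan = shutdown_schedule(systems, runtime_left_mins=120)
```

Reading the result: office e-mail begins an orderly shutdown 45 minutes in, and core transaction processing runs until minute 75, leaving the 15-minute reserve untouched.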

Plan Maintenance and Rehearsals

How often do companies check their business continuity plans for thoroughness, updates, and maintenance? Many companies build a continuity plan, perhaps test it once, and then leave it on a shelf collecting dust. Some companies were caught out by simple routine maintenance that wasn’t performed, maintenance that could have prevented problems and enabled a smoothly operating recovery. Problems experienced by some customers included insufficient cooling capacity, leaky fuel tanks, malfunctioning generators, and a lack of diesel fuel.

There are also some routine exercises that can be performed periodically, such as call-tree exercises. Who are the primary, secondary, and ultimate emergency contacts? Do we have everyone’s most current contact information? Do we know how to contact utility providers, government officials, and emergency management personnel? What is the response time for each before the next person on the list is contacted? Who is responsible for distributing all this information and keeping it valid? Will these people have transportation available to reach the facilities? Some companies initiate random disaster exercises to validate and update their plans. Building a mock power-failure exercise into your plans may help to better define them.
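The call-tree questions above can be captured in a small model. This sketch (the contacts and response windows are hypothetical) shows how an escalation list with per-contact response times determines who should be called at any point during an exercise:

```python
def current_contact(call_tree, minutes_elapsed):
    """call_tree: list of (name, response_window_mins) in calling order.

    Returns the contact who should be ringing after `minutes_elapsed`
    minutes without an acknowledgment, or None once the tree is
    exhausted and escalation must go outside the list.
    """
    cutoff = 0
    for name, window_mins in call_tree:
        cutoff += window_mins
        if minutes_elapsed < cutoff:
            return name
    return None  # e.g. contact utility providers or emergency management

# Hypothetical call tree: primary, secondary, then the emergency contact.
tree = [
    ("primary: J. Smith", 10),
    ("secondary: A. Jones", 10),
    ("emergency: duty officer", 15),
]
```

Walking the list with a stopwatch during a drill makes the documented response times testable rather than aspirational.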

In Summary

Even today, few IT budgets have sufficient funding for business continuity efforts. Companies need to consider the costs across their business, not only those related to technology, but also lost productivity, inconvenience to customers and suppliers, and possible damage to the company’s reputation. For some businesses, the blackout was a reflection on the methods used to back up data and on how long it would take to restore that data, if it could be restored at all. How can you calculate the cost of lost or corrupted data? Is the time it takes to restore data equally as important as restoring IT operations and worker productivity? For many businesses, these factors are closely interlocked. This is why risk assessment and risk management have become an important part of building a business continuity plan.

Finally, but very important, the most effective thing most companies can do immediately following such an event is to review what happened, what went well and what didn’t, and identify what they could have done better.
If you have any questions about this article or Business Continuity & Availability Solutions, please contact HP at 1-800-863-5360.


Belinda Wilson, CBCP, is the director of Worldwide Business Continuity & Availability Solutions (www.hp.com/go/businesscontinuity) and also the vice-chairman of DRI International. She has 18 years of business continuity experience and is a certified business continuity professional.


IBM Business Continuity and Recovery Services
By PATRICK CORCORAN

On Aug. 14, within minutes of the initial power disturbance that affected the Northeastern and Central U.S. and parts of Canada, IBM’s Business Continuity and Recovery Services’ Emergency Operations Center in Sterling Forest, N.Y., declared code “Business Condition Red.” This business condition is an IBM first-action response to a severe threat to its customers’ ability to maintain business operations.

As a result of the blackout, thousands of companies across a significant portion of North America lost two critical elements of their businesses: power and telecommunications. IBM used its resilient electrical infrastructure to maintain continuous power throughout the event to support those clients who declared a disaster.

Managed by the emergency operations center (EOC), IBM continuously monitored the status and technology operations of all IBM recovery sites to ensure all client demands were met. With the unstable telecommunications in the affected area, IBM transferred the call center and EOC to its alternate facility in Boulder, Colo. This assured a stable line of communication allowing clients to contact IBM to declare a disaster. Many IBM employees – some of whom volunteered to help while on vacation or out of the office – embarked on around-the-clock shifts to assist customers throughout the blackout.

Many of IBM’s largest customers had emergency power reserves on their own premises. However, significant demand for emergency services came from mid-sized clients in need of multivendor system recovery services as well as end-user workspaces (a desk, computer, copy facilities, and a connection to the home network). The need for workplace recovery services was elevated among clients of all enterprise sizes. Invocation of IBM’s services began within minutes of the power failure, and the first customers arrived within a few short hours at IBM facilities dedicated to provisioning multivendor IT recovery services. Clients placed in Sterling Forest, N.Y.; Toronto, Canada; Gaithersburg, Md.; and other locations began to execute the recovery programs they had exercised successfully, in some cases over many years.

Given the breadth of the impact of this outage – in many respects it was multi-regional in nature – a nationwide instant messaging conference was used by the emergency operations team to provide a virtual connection between all IBM Business Continuity and Recovery Services centers in North America. Emergency operations status conference calls for the United States and Canadian teams were held every four hours, to review status of customer response plans and to deploy additional resources if needed. The IBM teams in the United States and Canada remained in “Business Condition Red” through the weekend and into early evening Monday.

By Monday, Aug. 18, the crisis was clearly under control by public authorities, and power had been restored to many areas that had been impacted. The IBM EOC moved to “Business Condition Yellow” and continued to monitor the situation for any changes.

“This is a reawakening for the business leadership,” said Don DeMarco, vice president for IBM Global Services’ Business Continuity and Recovery Services. “Once again, clients have been exposed to ‘information-based risk.’ Previous crises like Hurricane Floyd in 1999 and the tragedy of 9/11 demonstrated that information has to be an essential component of corporate risk management.

“With each successive crisis, and let’s hope they’re few and far between, our clients are better prepared.”

With each widespread emergency event, awareness of the issue of business continuity and decisions regarding information-based risk come front and center for business executives. More and more firms are focusing on these concerns and recognizing that information-based risk mitigation is not exclusively the responsibility of the IT management chain. Business leadership, with the help of IBM’s Business Resilience and Continuity worldwide consulting practice, is recognizing that policies must be in place to address four levels of intervening governance:

1) Government regulations which require strict compliance.
2) Supply chain and third-party resiliency demands, with which the firm must show evidence of compliance.
3) Enterprise-wide corporate governance, driven and often managed by senior executives and the board of directors, designed to protect the best interests of the shareholders.
4) Business unit or division-level governance to assure resilience within a business unit and create competitive advantage in the marketplace.

As an example, the Securities and Exchange Commission, Federal Reserve Board, and the Office of the Comptroller of the Currency have been working to develop a set of recommended practices (they have received input from IBM and other companies) to strengthen the resiliency of the U.S. financial system. The Federal Reserve Bank of New York also participated in this initiative, which involves drafting a white paper that identifies specific new business continuity objectives of special importance to all financial firms in the post-Sept. 11 risk environment.

As early as this fall, you can also expect IBM to take a leadership role among IT services firms in proposing similar recommendations for all businesses – including those outside the financial sector.

In short, companies have started to realize that they participate in a greater ecosystem – and that their IT systems are only as resilient as the firms that they rely on to stay in business.

For more information, please visit our Web site: www.ibm.com/services/continuity.


Patrick Corcoran is the manager of global marketing and business development for IBM Global Services, Business Continuity & Recovery Services.

SunGard Availability Services
By JUDITH ECKLES

The power outage that hit the Northeast and Canada in mid-August served as another wake-up call for businesses that haven’t developed solid business continuity plans. Similar to the events of 9/11, the power outage was a sudden, devastating regional event that wreaked havoc on the Northeast United States business infrastructure. In some ways, the blackout created even greater challenges for SunGard Availability Services than 9/11.

In its 25 years in business, SunGard Availability Services has helped many companies get through regional disasters. However, planning for information availability changed following 9/11. Many companies began looking at backup facilities several hundred miles away, never anticipating that both primary and secondary sites might be affected by the same event.

The size of an organization doesn’t always equate to how well the organization has planned for information availability. A company fares better during a disruption if its business continuity team has taken time to test its plan. Last week’s power outage demonstrated that once again. While everyone recovered, those that had aggressive testing programs in place recovered more efficiently.

Those customers who have engineered information availability and host their mission-critical applications at a SunGard facility were not affected by the power outage, thanks to the reliability of SunGard’s uninterruptible power.

Managing the Crisis

The blackout became a reality for hundreds of SunGard Availability Services’ customers just after 4 p.m. on Aug. 14. The company acted immediately and convened its dedicated crisis management team just minutes later, at 4:11 p.m. The Crisis Management Center was up and running at 4:30 p.m.

As a result of the blackout’s wide reach, the number of companies affected throughout the U.S. was significant. However, SunGard Availability Services’ staff and facilities were up for the challenge. More than 300 calls were received in the first hour following the blackout. At final count, 166 customers put SunGard on alert and 66 companies made disaster declarations.

The blackout was the second-largest regional disaster that SunGard has ever supported. Hurricane Floyd produced more alerts (189), and the attacks of Sept. 11, 2001, produced more disaster declarations (121).

Within the first eight hours following the power outage, SunGard allocated 1,600 end-user seats in its Northeast facilities alone and had contact with 1,200 customers. SunGard supported the more than 1,000 customers who ultimately used its various recovery facilities. Over the entire course of the outage, a total of 2,000 end-user seats were allocated across the eight facilities.

SunGard also supported numerous recovery efforts for mainframe and midrange platforms in addition to the work groups.

More than 30 members of the SunGard Availability Services staff worked more than 24 hours straight to assist customers in making full recoveries.

“This recovery effort was a testament to the capabilities of SunGard Availability Services and its commitment to helping customers achieve information availability,” said Jim Simmons, CEO of SunGard Availability Services. “The power outage has served once again to demonstrate the importance of information availability.”

Lessons Learned?

Unfortunately, many businesses hadn’t acted on the lessons that should have been learned from 9/11 and were put in a challenging position due to the blackout. In fact, a Harris Poll study found that while Corporate America has made some significant strides to protect information availability against disaster, there are still areas of businesses that remain at risk.

The study found that two in three executives said their company was more prepared to access business-critical information than it was prior to Sept. 11. However, those same executives reported that only 58 percent of their companies currently have disaster preparedness training for employees who deal with information access. In addition, executives gave their companies only a grade of “C+” when it comes to their ability to access business-critical information quickly after a disaster.

“In the financial services industry, for example, where minutes of downtime can cost millions of dollars, a ‘C+’ is a failing grade,” said Simmons.


Judith Eckles is the senior director of special projects for SunGard Availability Services. She was a founding member and former chairperson of the DRJ Editorial Advisory Board, the first vendor representative and the first woman to serve on the DRII Board of Directors, and a 20-year member of the Public Relations Society of America.