Three years ago I asked, “What if a disaster shut down access to Valley Forge?” At the time, no one expected such a catastrophe to happen, but that question led to AmeriGas’ development of a disaster recovery and business continuity team that established a back-up plan. Begun in 2006, the plan spanned IT and critical business function recovery, leveraged third-party recovery space, and included annual testing. The goal was simple – to “minimize the impact to customers, employees, and cash flow.”
At 6 p.m. on Wednesday, Dec. 16, 2009, all of AmeriGas’ preparation was put to the test as fire broke out at the joint headquarters of AmeriGas and AmeriGas’ parent company UGI. As a result of prior preparation and the general “can do” attitude that permeates the AmeriGas organization, recovery teams were able to avert any material impact to AmeriGas’ field operations. This article documents the incident, response, and lessons learned for the purpose of sharing and helping other companies consider how they may establish a pragmatic yet effective approach to continuity.
In 2006, AmeriGas began to address business continuity by identifying a back-up scenario for a critical enterprise transactional system. Like most companies, AmeriGas started its continuity-of-operations work with information technology. Through a gradual and focused approach, critical systems were added to scope, offsite recovery space was identified and contracted, and multiple disaster recovery tests were executed. Each time, AmeriGas IT focused on improving its ability to recover critical systems in preparation for a disaster.
Additionally, AmeriGas expanded the continuity program to the finance group, developing key finance function recovery scripts in 2007 and beginning recovery exercises in 2008. Two such exercises were completed, one in 2008 and one in 2009, with all critical personnel traveling to the off-site recovery space and transacting business after IT had recovered systems. Although only two such exercises had been completed before the incident in December 2009, the experience gained proved invaluable to the organization, according to William Stanczak, controller and chief accounting officer, who noted that “exercising helped them prepare mentally for recovery.” AmeriGas had focused on continuously improving plans, achieving 144 of 148 business objectives in the 2009 business continuity workspace recovery test, which occurred roughly six months before the fire.
As with any incident in which normal operations are disrupted, the effects extended beyond technology and business processes. The human element, and the sense of loss that AmeriGas employees felt, needed to be addressed at the onset of the incident.
“During our town hall meetings, some of the questions we found ourselves dealing with related to personal effects in offices, plants and photos, and similar items,” said AmeriGas Chief Financial Officer Jerry Sheridan. “We really didn’t consider that personal sense of loss that was felt by many, nor did we have a defined plan regarding how to deal with it.”
In keeping with AmeriGas’ familial office culture, senior leadership quickly established reimbursement for incidentals that might have been lost during the fire and held frequent town hall meetings.
After IT had recovered critical systems, mission-critical employees were the first to arrive at the recovery site to transact work. The rapid return to “business as usual” was remarkable given the loss of the 460 building; within 24 hours of disaster declaration, dedicated recovery workspace accommodated a full complement of 160 seats, far more than originally planned. Well-organized communication between AmeriGas and the recovery site’s dedicated teams ensured that workspace and daily supply needs were promptly met. This partnership was key to AmeriGas, as the loss of the 460 building created an emotional void for employees, and people simply wanted to be involved in the recovery.
While 30-50 people had been familiar with the recovery location from prior tests, they had not anticipated actually working out of the interim space for an extended period of time. Small quarters, the increase in personnel, and the resulting rapid consumption of office and other supplies meant that AmeriGas recovery leadership monitored morale at the recovery site very closely.
Additionally, some of the equipment issues that had been identified in the business continuity workspace recovery tests had not yet been addressed; as a result, business operations were slowed due to having to make do with fax machines in place of high-speed printers. Such elements were considered to be typical lessons learned for a situation of this type.
The First 72 Hours
John Iannarelli is used to working late and, as such, it was no surprise that he was onsite when the fire broke out on Dec. 16, 2009. As the chair of the AmeriGas business continuity team in 2009, Iannarelli was intimately aware of the incremental and continuous improvement that AmeriGas had made leading up to the incident.
The first 72 hours were well coordinated, but Iannarelli admits they “really hadn’t planned crisis management past that.” The short-term coordination of AmeriGas leadership exposed many of the details that can be overlooked in crisis planning. Items such as conference bridge lines, up-to-date phone lists, and meeting space were all worked through well by AmeriGas at the time of the incident; but having been through the experience, Iannarelli noted that some planning should be done in advance to ensure they know “what to do, who to do it, and how to be effective quickly.”
In those critical first 72 hours, the AmeriGas organization rallied, specific recovery teams knew their roles and stepped up valiantly, and AmeriGas’ field operations, which drive the revenue for the business, were unaffected.
Getting Back to a New Normal
After the initial crisis response period, and upon learning the true extent of the fire damage, it became apparent to AmeriGas senior leadership that a recovery team needed to be established with a single program director to coordinate all of the key workstreams. Steve Kossuth, AmeriGas procurement director, was named to that role and worked to coordinate the facilities, IT, desktop computing, telecommunications and fax, and applications teams. Soon into the role, Kossuth found that communication across the teams was essential because of the rapid timeframes and sheer volume of work to be completed. Additionally, it was clear that senior operating management needed to be shielded from the day-to-day recovery activities in order to focus on business operations. Kossuth found himself acting as a buffer and translator for recovery activities (many of them technical and detailed), allowing recovery teams to function while senior management stayed apprised of progress.
Confronted with an aggressive goal to have all AmeriGas employees back into an office location and functioning as soon after Jan. 1, 2010, as possible, the recovery team mobilized immediately and reached out to key vendors.
“Suppliers stepped up and worked side by side with AmeriGas – they put in long hours up to 16 hours some days and worked with us on holidays (to help with the recovery),” said Kossuth. He believes the make-up of the recovery team and the fact that everyone felt recovery was a “personal responsibility” was a major factor in the achievement of the aggressive goal of having all AmeriGas employees into the recovery space and functioning by Jan. 5, 2010, less than three weeks after the fire incident.
Having been the focus of the start of the continuity program at AmeriGas, information technology was arguably the most prepared to respond to a disaster. Martin Gibbins, director of technical services, feels that the success of the recovery comes down to two simple factors. “We had a good plan and we were able to execute it.” While that seems a bit understated, Gibbins feels that it really was the difference between a successful recovery, with low to no impact felt by the field, and the potentially catastrophic situation that could have ensued had no plan been in place. “We tested well, raising the bar each time,” Gibbins noted.
Aside from the execution of the technical recovery, Gibbins notes that clear communication to employees was critical. Some people were getting information from the press and news outlets rather than through AmeriGas leadership, and thus were receiving differing information. Town hall meetings were excellent ways to clear up confusion, put employees at ease, and reassure them that AmeriGas had a plan and was executing it well.
Additionally, Gibbins notes how noticeable the difference was between those who had tested and those who had not. Practice of the technical recovery plans allowed individuals to focus on their tasks at hand rather than having to think through appropriate next steps in the fog of the recent incident.
Lastly, Gibbins notes how the situation changed some of the mindset of the IT organization – moving the culture more toward the concept of making critical decisions based on “80 percent of the information” rather than waiting for 100 percent of the knowledge to be obtained. Gibbins feels that this has instilled a more nimble and business-focused approach to the IT group.
While the recovery efforts were extremely successful, AmeriGas feels that there were many lessons to be learned from the incident and has already begun to address some critical elements of continuity in order to be even more prepared for a possible, although hopefully never-to-arrive, future incident.
Below are some lessons AmeriGas has noted as a result of the fire incident:
- Start with a manageable yet meaningful scope (critical systems) for recovery plans and tests and increase the scope incrementally.
- Don’t limit exposure to the recovery plan(s) and recovery site(s) to those who are immediately critical. Allow other members of the organization to take a trip to the recovery site, to understand what plans exist, and what would happen to mobilize after an incident.
- Ensure adequate space is reserved for key personnel at the recovery site.
- Connectivity is critical.
- Key peripherals are critical and need to be included in recovery plans.
- Relationships with key vendors are crucial to success.
- Clear and consistent communication is critical.
- Crisis management planning should consider long-term displacement, not just a few hours or days; AmeriGas’ plans did not contain any provisions, instructions, or contact information to assist in interim site procurement.
- Ensure that critical office supplies are considered; for example, if you use high-speed printers in a given department, ensure that you have those as part of your recovery or have a quick-ship arrangement to get them.
- Consider the human element of disaster; designate resources to deal solely with the personal impact and morale of the workforce.
- Consider engaging critical vendors about their recovery plans and how they could support you at time of disaster. Consider other vendors that might be “ancillary” and engage them as well (e.g., office furniture, wiring, office supplies, etc.).
- Don’t undersize your recovery location – it’s difficult to ask office personnel to work from home for an extended period of time.
- Enforce policies related to shared drives and files; work to ensure critical files are not on employee HDDs but are in fact included on backed-up servers.
- Ensure connectivity in/out of the recovery site is adequate for real work volumes.
- Plans will cover most of what you need to address for recovery – your people will cover the missing elements.
- Establish clear and rapid communication protocols for the workforce and choose a person to lead this effort immediately after the incident; as social media becomes more prevalent, misinformation can cause a severe impact on morale.
Rick Fabrizio has been vice president of information technology and chief information officer of AmeriGas Partners, L.P. since October 2005. AmeriGas is a $2.2 billion national marketer and distributor of propane, propane equipment, and related services headquartered in Valley Forge, Penn. Prior to joining AmeriGas, Fabrizio served as director and chief information officer of PQ Corporation, which is an industry-leading global chemicals and engineered glass materials company. Prior to PQ Corporation, he served in various IT roles at Campbell Soup Company. Fabrizio’s career spans all realms of information technology including IT strategic planning; ERP selection and implementation; IT organization turnaround; and IT business alignment and transformation.