Testing Pays Off For Penn Mutual
- Published on Monday, 29 October 2007 01:47
The basic premise of disaster recovery is that a tested plan is the only way to recover. Penn Mutual Life Insurance Company recovered from a fire in its downtown Philadelphia building for two reasons: it had tested its plan many times before the fire, and it executed that plan with expertise and precision.
The alarm was pulled shortly after 4 p.m. on Tuesday, May 30, 1989, when smoke was discovered in the records room on the ninth floor. The fire raced through the ninth floor of Penn Mutual's 530 Walnut Street building in downtown Philadelphia, destroying thousands of documents. At times the temperature hit 2,000 degrees, and the fire spread quickly among the room's largely paper contents. By early Wednesday morning, it had escalated to a 9 1/2 alarm fire, and eventually as many as 500 firefighters were required at the scene. The fire displaced about 1,500 employees of various companies in the building. Arson is suspected and a reward has been offered.
THE QUICK RESPONSE
We interviewed Paul Trainor, Vice President of Information Systems, about the fire and recovery. The fire started in the Penn Mutual records center on the ninth floor, two floors above the data center, and burned for two days.
Firefighters poured 12 million gallons of water on the fire, and the runoff flowed down to the data center, destroying the ceiling tiles and causing severe water damage to the computer equipment. The plastic sheets used to cover the equipment were ineffective against that volume of water.
At approximately 9 p.m., Mr. Trainor decided they could not continue and declared a disaster to SunGard Recovery Services. (SunGard Recovery Services provides alternate data processing facilities and services in the event of a computer disaster.) Penn Mutual's backup tapes arrived at SunGard's Philadelphia Recovery Center at 1 a.m., and by about 9 a.m. the data had been restored. By 11:55 a.m. Wednesday, every application was up and running except two minor internal tracking systems, which were brought up within two hours.
Because of the complexities of partial backups, Penn Mutual had changed its philosophy from defining critical applications and performing partial backups to performing full backups.
Their goal was to restore the system and begin operation at the recovery site within 24 hours. As a result of the testing they had done previously, and with the help of SunGard's skilled professionals, they were able to recover the operating environment and key applications within 13 hours.
Nearly two years ago, Penn Mutual moved most of its business staff out of the Philadelphia location to a suburban site 20 miles away, so communications to those offices had to be established from the recovery site. The company uses a T1 circuit and dial backup alternatives to communicate with nationwide agency offices in a recovery mode. The communications equipment at SunGard was able to handle all of Penn Mutual's communication needs. They used SunGard's SunNet II modems locally and sent others to the outlying branches.
Mr. Trainor stated they had regularly tested their recovery capability and had just completed a test two weeks prior to the fire. The test also familiarized Penn Mutual with SunGard's facilities and personnel. He also commented, "If it hadn't been for our vigorous testing program, we would have had an extended outage. There's no question of that!"
The effect of the fire on the other parts of the company was minimal. Ninety-eight percent of the administrative and customer relations functions and personnel were at other locations.
THE COMMAND CENTER
The command center set up at SunGard was the single point of contact for the outside world. All calls came through command center personnel, so they could be handled in an orderly manner. Staff answered questions about the fire, the data processing center and how long operations would be down. The command center was also the place from which the recovery plan was implemented.
THE MOVE TO THE COLD SITE
It was apparent to Penn Mutual that the outage was going to be long term and that they needed to start the transition to SunGard's cold site. A major problem, according to Mr. Trainor, was that he had to acquire a complete data center in a relatively short period of time. They had to find and acquire 170 gigabytes of DASD. Short-term leasing is very expensive. Mr. Trainor said, "The first couple of days you are in total shock. Then you realize that you have to populate your cold site. The main question at that point was which vendors could deliver on time and which ones could not. In general, the larger equipment suppliers all did an exquisite job and some of the smaller ones did not."
KEY ISSUE FOR RECONSTRUCTION
A key issue that must be addressed when reconstructing your DP environment is the insurance settlement. If your equipment is not totally destroyed by fire, you might not get a full settlement for it, yet you still have to acquire equipment for the cold site.
Penn Mutual stresses that the major reasons for their successful recovery were: (1) preparing a disaster recovery plan, (2) subscribing to SunGard, and (3) testing, testing and more testing.
Thanks to their comprehensive, thoroughly tested recovery plan, implemented in conjunction with SunGard's professional staff, Penn Mutual recovered successfully from what could have been a devastating disaster.
This article was written by Richard Arnold, editor-in-chief, Disaster Recovery Journal.
This article was adapted from Vol. 2, No. 3, p. 4.