When It Rains It Freezes:
Canadian Company Battles Northern Exposure
by Judith L. Eckles
Living at latitude 45°N means winters can be almost Hobbesian: nasty, brutish, and, in this case, long. But even by Canadian standards, the storms that kicked off the new year north of the border were particularly wicked, giving a new meaning to the old Bob Dylan lyric "a hard rain's a-gonna fall."
Days of non-stop rain and temperatures hovering around freezing left southern Quebec and eastern Ontario blanketed by several inches of solid ice, halting virtually all travel, shutting down businesses, cutting off power to more than three million residents, and socking the Canadian economy with business losses estimated at $1.1 billion, or 0.2 percent of GDP.
It could've been worse.
January 5: A disaster in the making
"On January 5, we lost electrical power in the building," says Guy Chamberland, Corporate Director - IT for Domco Inc.
Domco is a leading North American manufacturer of vinyl floor coverings for commercial and residential markets. The company is headquartered about 30 miles southeast of Montreal in Farnham, Quebec.
Domco has a production facility in Farnham, as well as two more in the U.S., half a dozen distribution centers operated by its Domcor division in Canada, and a major customer service operation in Alabama. In all, there are more than a dozen sites in the U.S. and Canada, and all of them are networked into an AS/400 in Farnham.
Like most businesses in Farnham, Domco practically closed its doors when the ice storm hit. For the few employees who could navigate the icy streets, which were littered with stranded cars, fallen trees, and utility poles, a cold office awaited because, of course, the power had been cut off. And besides, the prospect of a month or more with no electricity at home was more than enough for most employees to worry about.
However, Chamberland, who also leads Domco's disaster recovery team, was prepared for a power outage. "We had a generator working, so we were still able to operate." And indeed, 500 users were still able to access the company's AS/400, which handles everything from production planning to order entry and fulfillment, and without which, Chamberland says matter-of-factly, Domco would be "out of business."
What Chamberland wasn't prepared for, what no one in the province was prepared for, was the accumulation of ice.
January 7: The ice tightens its grip
"We had a telecom wire running from the street to our building," Chamberland says, "and we expected that to be up." However, as the ice built up, it became clear that that expectation might require revision. The telecommunications line is critical because it links order entry, shipping, and invoicing from Domco's six distribution locations across Canada to the AS/400 in Farnham. The line is also Domco's principal connection to major operations in Florence, Alabama, and Houston, Texas. In fact, Chamberland was in Florence the week the disaster struck and had great difficulty reaching his disaster team in Farnham. He barely managed to get back to Farnham on Saturday, January 10.
Domco had installed a platform to support the telecommunications wire from the street into their building. But a chain is only as strong as its weakest link, which in this case was the telephone pole. And by January 8, under the weight of the ice, telephone poles in Farnham were snapping like toothpicks.
January 9: The inevitable
Losing the telecommunications line seemed inevitable, and it was: a day later, on January 9, the line went down. That evening, Jean-Guy Lafond, a member of Domco's technical support staff, called SunGard Recovery Services and declared a disaster.
"We made full system backups," says Chamberland, "and Jean-Guy Lafond and Patrick Dubois drove these and our data backups to the airport." But not the Montreal airport, even though that's the designated airport in their recovery plan.
Nothing was flying in or out of Montreal, says Chamberland, so "they kept driving until they reached an airport that hadn't been closed by the ice storm. They finally took off from Burlington, Vermont, about an hour and a half from Montreal, and flew to Philadelphia."
January 10: The recovery begins
By the time Domco's disaster recovery team arrived in Philadelphia Saturday evening, SunGard's own recovery team had already been at work most of the day.
"We assigned an AS/400 for Domco to use for the recovery," says Bob Parker, SunGard's Supervisor of Operations for IBM AS/400 and RS/6000 platforms. "We initialized the operating system, initialized 60 gigabytes DASD, and established the necessary communications lines."
Parker and the rest of the SunGard team assigned to Domco's recovery had a head start. "I did a workshop with Domco," says Parker, "and we ran a very successful test with them in March '97, so we were familiar with their system and their objectives."
By 8:15 p.m., Lafond and Dubois had completed overlaying Domco's own operating system and microcode. They then started loading data, and by 5:30 the next morning, they'd finished restoring the system. Domco IPL'd the system at noon on Sunday, January 11, less than 36 hours after declaring a disaster.
January 11: The second full day of the recovery
Sunday started a full week of 9 a.m. to 9 p.m. shifts for the two Domco employees, who turned over the reins to SunGard's recovery team for the stretch between 9 p.m. and 9 a.m.
The next step for the recovery team was to establish communications with Domco's Florence and Houston operations.
"The communications setup went smoothly," says Parker. "There was a minor problem with controllers, but these were resolved very quickly."
The problem was with Domco's Frame Relay Access Device, or FRAD, says Charles Ernst, SunGard's Supervisor of Network Operations. "They had a dedicated 56k frame relay circuit coming into the FRAD from MCI, and they ran two com ports off the FRAD to the AS/400. But initially there was some trouble with the FRAD because some recent changes in Domco's production environment hadn't been accounted for in their recovery configuration."
The dial-ups were working right away, so while Ernst and his team worked through the FRAD problem, Domco had its people dial in.
"We changed resource names for the frame relay and shipped a workstation controller to Domco's Farnham office," says Ernst, "but by the time a technician arrived from Toronto, the controller had already been fixed."
January 12: Houston, we don't have a problem
It took Memotec about a day to get the FRAD configured and set up, and from that point, on the morning of Jan. 12, Domco was able to run production out of SunGard's recovery facility.
Chamberland recalls that by midday Domco was "up and running at several sites. Monday afternoon, sites were coming back, one after another. One thing that was special, though, was that although our other sites could reach SunGard, we couldn't reach SunGard from Farnham for the first week." That's because the telecom system in Montreal was in tatters. Domco's plan was to call in to have access to the frame relay, and let the locals operate off the local AS/400.
January 16: Homeward bound
By the Friday following the disaster declaration, the recovery at SunGard was going so smoothly that Domco recalled its two recovery specialists from SunGard's Philadelphia facility. However, because power back in Farnham was still unreliable, Domco continued running operations off SunGard's AS/400.
Starting January 16, says Chamberland, "we were running the AS/400 remotely using PC Anywhere."
"We called them at the beginning of each shift," Parker says. "We'd let Domco know who was going to be on duty for us and whether there were any issues." SunGard also ran a daily backup routine for Domco's data.
January 31: The thaw commences
"Electrical power was back in Farnham by the end of January," Chamberland says, "so we were without our main power from January 6 to the end of the month. During that time, we operated by generator. But those generators aren't geared to operate for days at a time, so ours was breaking down and requiring constant maintenance. Our IT was probably running around 90 percent in Farnham, but total operations were probably at 50 percent of normal business. However, starting January 12, locations in the United States and across the rest of Canada were able to operate through SunGard without any problems.
"We had the recovery team for the first two weeks only," he says. "After that, we were operating remotely from Farnham with SunGard's help. And by February 7 or 8, we were able to bring computer operations back to Farnham."
March 10: The aftermath
"I have to raise my hat to the recovery team," says Chamberland from his (fully electrified) office in Domco's Farnham headquarters. "They left their families to take care of the recovery, and they've been very willing to help."
Now, he says, "We're probably back to normal operations, normal life. It's behind us. It's really been an interesting experience. Everybody understands this has been quite an experience.
"When we do our debriefing, we'll ask ourselves if we could have done anything differently," Chamberland says. "But I think it would have been difficult to convince management that we needed to prepare for a massive ice storm." In other words, they responded flawlessly to the unforeseeable.
When a recovery goes this smoothly, the disaster often goes unnoticed, which is good. Chamberland says that few people outside his group realized that Domco was operating in a disaster mode.
However, the recovery efforts didn't go completely unnoticed. "Upper management was very appreciative," says Chamberland, "and Domco's president has personally rewarded several employees from the IT department for their contributions during the outage."

Judith Eckles is Director of Marketing Communications for SunGard Recovery Services and has been with the company since 1990. She is the immediate past Chairperson for the Disaster Recovery Journal's Editorial Advisory Board and was recently appointed to the Board of Directors of the Disaster Recovery Institute International, serving on a newly formed marketing committee.


Copyright (c) 1995 Systems Support Inc. All rights reserved.
Reproduction in whole or in part in any form or medium without the express written permission of Systems Support Inc. is prohibited.



Last Updated--Wed. April 29, 1998.