For The Want of a Nail . . .
Commentary by Benjamin W. Tartaglia
You probably have heard the story told by the old poem, but let me retell it as accurately as I can recall
it. It goes like this:
For the want of a nail, a shoe was lost.
For the want of a shoe, the horse was lost.
For the want of a horse, the rider was lost.
For the want of a rider, the battle was lost.
For the want of a battle, the kingdom was lost.
The story describes how a seemingly inconsequential detail can lead to a disaster.
Consolidating responsibilities to save money, laying off technical and management staff during cutbacks, not taking enough time to train people in their new positions, and omitting routine maintenance and testing can lead, and have led, to disastrous outcomes.
We don't know what really may have happened regarding the failure of the horseshoe, but we have seen the result. Who was to blame? The generals? The blacksmith? Did the rider knowingly take a poorly equipped steed into battle? We just don't know.
Are we doomed to repeat the mistakes of the past? If history doesn't repeat itself, circumstances with a propensity towards disaster certainly do!
On Tuesday, September 17, 1991, another shoe was lost for want of a nail, and a set of circumstances was created with a risk exposure of frightening proportions. It's another case of how missing nails could have contributed to the incredible loss in service.
Actually there were at least three (3) nails improperly installed in this particular shoe. The first was the apparent absence of proper maintenance and testing of the backup power system. Then there was a bulb, yes, a bulb, in the visual alarm system which had not been replaced when it burned out. Then there was the audio alarm which reportedly malfunctioned, whatever that means. These nails contributed to the system failure.
In this case, circular blame will be generously spread, and the accepted truth of what happened will be whatever account is repeated the most times.
What we do know is there is a trend across the nation to cut costs, increase productivity, decrease personnel and in general do-more-with-less in telecommunications. Budgets are being cut, technical and managerial positions are being eliminated, reporting relationships are being changed and responsibilities are being reassigned.
All this is being done while we are increasing our reliance on telecommunications for virtually all our business, government and personal functions.
Our increasing dependence on telecommunications and our increasing reliance on telecommunications-based services will result in additional and devastating disasters. Undoubtedly some of these events will be due to missing nails in the shoes. So what can we do to minimize system failures?
Management must effectively integrate telecommunications into the disaster recovery process. Telecommunications must have the same reporting level as facilities management, security and data processing. All telecommunications functions should report to one responsible person. This includes telephones, data communications, local area networks, hard-wired data cables, intra- and inter-building cables and communication paths, remote location and long distance networks, and all telecommunications supporting computer systems.
Management must insist on the development of a strategic plan for disaster recovery. This plan should contain input from all parts of the organization and should have the objective of mitigating damage in a disaster. The plan should appear on the agenda of the top management meeting at least quarterly.
Finally, management must review the nails regularly. Are the visual and audible alarms working properly? Do the rectifiers work? When were the backup power systems tested, and how many hours were they run? Five or six hours or only ten minutes? How many minor failures were reported, by whom and when? Are there any patterns?
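A regular review of the nails can be as simple as a short checklist run against the maintenance log. The sketch below is only an illustration, not a procedure taken from any particular site. The names `FailureReport` and `review_nails`, the five-hour minimum test run and the repeat threshold of two reports are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed minimum run time for a backup power test (hypothetical value,
# chosen to echo the "five or six hours or only ten minutes" question).
MIN_BACKUP_TEST_HOURS = 5.0


@dataclass
class FailureReport:
    """One minor failure from the maintenance log (hypothetical record)."""
    component: str    # what failed, e.g. "rectifier"
    reported_by: str  # who reported it
    date: str         # when it was reported, e.g. "1991-09-17"


def review_nails(alarms_ok, backup_test_hours, reports, repeat_threshold=2):
    """Return a list of findings that management should follow up on."""
    findings = []

    # Are the visual and audible alarms working properly?
    for name, ok in alarms_ok.items():
        if not ok:
            findings.append(f"{name} not working")

    # How many hours was the backup power system actually run?
    if backup_test_hours < MIN_BACKUP_TEST_HOURS:
        findings.append(
            f"backup power test ran only {backup_test_hours:.2f} hours"
        )

    # Are there any patterns in the minor failure reports?
    counts = Counter(report.component for report in reports)
    for component, count in sorted(counts.items()):
        if count >= repeat_threshold:
            findings.append(f"pattern: {count} minor failures on {component}")

    return findings
```

An empty list means no loose nails were found by these particular checks. It does not mean that none exist.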
What is staffing based on for telecommunications? Is it based on budget-cut objectives set by an inexperienced manager under pressure, or is there some rationale to staffing? How about using a standard such as the number of ports, number of miles of cables, number of locations, distance between sites, number of additions, modifications and deletions of terminal equipment, the degree of system management computerization, the relative difficulty of managing different systems, the number of shifts worked at a site and the experience base of the staff?
Let's not wait for an occasion to place blame. Instead let's plan effectively, using every glitch in a system or discovered loose nail as an opportunity to learn and plan better. Let's use the recent event as the impetus for reexamining our policies, procedures, job descriptions, staffing guidelines, organization charts and systems. Let our objective be to ensure the nails are able to support the shoes, riders and battles.
Benjamin W. Tartaglia, MBA, CSP, is President of BWT Associates, Independent Consultants to
Management. The firm specializes in loss prevention, mitigation and disaster recovery relative to
This article is adapted from Vol. 4 #4.
Disaster Recovery World © 1999, and Disaster Recovery Journal ©
1999, are copyrighted by Systems Support, Inc. All rights reserved. Reproduction
in whole or part is prohibited without the express written permission from
Systems Support, Inc.