Don't check into the Heartbreak Hotel without an aspirin
On a flight recently I sat next to a professional in the pharmacy business with a heartbreaking story. He was lamenting the fact that his laptop hard drive had suddenly become corrupted. All efforts to recover the data had failed. The only solution was to reformat the drive and begin again. The saddest part was that he had lost five years of work stored on his hard drive. Can you imagine a laptop running for five years without a corrupted file? Surely you and I would have purchased and employed a tape backup long ago. Well, you had better believe that he has a tape backup now, but that is of little consolation. I rubbed salt into his wound when I told him that companies exist that could have recovered his data, had he not reformatted the drive and destroyed all hope.
Detail shop hurt by neglecting the details
My almost-new white Porsche needed some side molding and striping, so off we went to the local automobile graphic arts shop. This place was noted for its colorful and exquisite Mylar auto designs. Over the years the artistic owner had created and stored his livelihood on a computer. The computer was cleverly connected to a plotter that would cut out his designs into Mylar patterns. Tragically, the previous week, his computer died during a lightning storm, taking his painstaking designs with it. I asked him if he had backed up his data. The look on his face told the pitiful story.
Examine your business continuity principles
All these sad stories emphasize a lack of business continuity awareness. The issues that revolve around the ability to recover from a disaster, or continue to operate the business in spite of an operational disruption, are directly related to your business continuity principles. Are they automated, or can 'human error' crop up? Among the scenarios that can turn a minor glitch into a showstopper:
- you have not backed up your data recently
- you have the information but you cannot find it
- you found the information - but the wrong tape was sent back from your off-site location
- your information is too old to be of value
- you lost the source of your application programs
- you lack recovery capability because backups from your old system cannot be restored to the new platform.
These are typical business continuity issues. If you maintain discipline in your business continuity processes, your recovery becomes an automated exercise and relatively easy to control.
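The first and fourth scenarios above (no recent backup, or a backup too old to be of value) lend themselves to an automated check. The following is a minimal sketch only: the dataset names, the timestamps and the one-day threshold are illustrative assumptions, not figures from any real installation.

```python
from datetime import datetime, timedelta


def stale_datasets(last_backup, now, max_age):
    """Return the names of datasets whose latest backup is missing or older than max_age."""
    stale = []
    for name in sorted(last_backup):
        stamp = last_backup[name]
        # None means no backup has ever been recorded for this dataset.
        if stamp is None or now - stamp > max_age:
            stale.append(name)
    return stale


# Hypothetical inventory: dataset name -> time of the most recent backup.
now = datetime(1997, 6, 2, 9, 0)
last_backup = {
    "payroll": datetime(1997, 6, 1, 23, 0),
    "designs": datetime(1997, 4, 15, 18, 0),
    "claims": None,
}

print(stale_datasets(last_backup, now, timedelta(days=1)))
```

Run on a schedule, a report like this turns "did anyone back up?" from a matter of memory into a routine exception list.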
A customer advisory board (CAB) conducted by StorageTek in 1994 asked users and potential users of disaster recovery and business resumption services, 'What are your key requirements?' By an overwhelming majority, the foremost requirement was to get critical data off-site. Most companies use what we call the FTAM approach, that is, the Ford Truck Access Method. This method is sound, because it assures physical separation of your data from the point of a possible problem. However, it also imposes a logical separation of you from your data. As soon as the truck leaves your facility with the backup tapes, the data is no longer dynamically accessible. Thus, in the event of a head crash, when you need your data back quickly, retrieval takes considerable time. The issue is timeliness: you are logically separated from your data by hours or days.
CAB participants also wanted high data availability, data integrity and the elimination of manual processes. They rightly perceived these manual processes as expensive, time consuming and error prone. Most recognized that problems are going to occur, yet the singular issue remains: 'Will the next problem grow into a disaster or be logged as a slight inconvenience?'
Consider this: While mainframe business continuity has matured (i.e., point of impact recovery, remote data mirroring, electronic vaulting, automatic application switch-over and the latest emerging technologies), business continuity on tactical systems continues to lag behind. Three questions spring from this enigma: 'Does mainframe-equivalent recovery technology not exist for Tactical Computing?' Interestingly, this situation cannot be blamed on limited technology availability for non-mainframe platforms.
Then, you might ask, 'Are the applications running on Tactical Computers less crucial to the corporation?' Not at all. Critical non-mainframe applications range from ATM servicing, to local health care, to legal and financial interests, to Internet Web page marketing. 'Are Tactical Computer users less sophisticated than mainframe users?' Not exactly, but they do suffer from an inherent disadvantage that mainframe users do not feel: mainframe users benefit from centralized control of resources. This article addresses some of the answers to these questions and the solutions available.
A 1997 study commissioned by Comdisco reported that a slim twelve percent of companies have an effective asset management program in place for enterprise-wide or 'open' computing systems [ii]. According to 226 IBM Summit '96 survey respondents, fewer than one company in 10 (a mere 8%) has an Internet business recovery plan in place [iv]. Nearly one in four of 300 companies surveyed by Comdisco felt that they were totally vulnerable in the event of a LAN disaster [v]. These are remarkable statistics, given that ninety-one percent of organizations now employ local area networks (up from seventy-eight percent in 1993).
Let's glance at another startling statistic. An Enterprise Networking Journal study in 1995 highlighted that while most companies focus on headline-grabbing natural disasters, a whopping fifty-one percent of business interruption incidents are caused by operator or user error.
Why is business continuity lagging in Tactical Computing?
Traditional methods of disaster recovery and business resumption planning are trickling into the Tactical Computing realm, albeit slowly. We operate in a complex tactical computing world: complex in the methodology and locality of the data we create, and particularly in the exponential rate at which new data is created. Enterprises of all sizes are placing new emphasis on understanding and valuing their 'knowledge warehouse'. Yet data recovery is a distant concern until your corporate travel department cannot book your airline flight because the reservation system is down. When you run a small office and experience a disruption in your payroll system, getting payroll back online will eclipse your concern about the cost of performing timely backups. When the network is down and your field agents cannot settle insurance claims quickly, you reap the negative effect on corporate customer satisfaction ratings. Hopefully you get the picture.
Why are traditional methods not being applied to the enterprise? Some of the reasons are quite simple. All our data used to be in one place. Data creators formerly relied on the data managers to protect their data (i.e., glass-house conventions). A paper trail always served as backup to the electronic system, or we could rely on the memory of a key individual in an isolated data center fortress to get us back in business. The difference is that now the data creators, with workstations and PCs on their desktops, are responsible. Did you back up your data before you left the office for the weekend? Did you back up your data at any time in the last day? Week? Month? And if you, a professional who should know what to do, did not, do you believe your field agents, customer service reps, engineers and finance people backed up theirs?
As the recognized and acknowledged value of data moves to the forefront of corporate thinking, smart 'Tactical Computing' professionals are taking a new look at an old problem (disaster recovery) and calling it, more appropriately, business continuity.
with the proliferation of
In the broader view, industry analysts suggest that executive awareness is a surprisingly strong motivator in the move to re-engineer business continuity. As we begin to rely on portable PCs while traveling and working at home, we feel more vulnerable. What if four hours of Sunday afternoon work on our laptop has disappeared by Monday morning?
Now that the computing platform for the company might be sitting on our desks rather than in the glass house, what happens when we have a failure? If a system or equipment failure causes so much grief on a micro level, what happens to the whole organization if the network goes down? It is amazing how frustrations at a relatively low level of the enterprise can open your mind to the high level of corporate risk.
Because of regulations such as the U.S. Foreign Corrupt Practices Act, businesses are reacting to the risks inherent in failing to maintain data or being unable to recover it. If you are an officer of a publicly traded company, you may be criminally liable if you do not safeguard the assets of your organization.
Data is considered a corporate asset. Given even a fairly weak interpretation of the Foreign Corrupt Practices Act, corporate officers are held financially and legally liable should the business fail to recover from a disaster quickly.
Regulatory agencies are also beginning to recognize issues surrounding privacy, security and data interchange. Partly because of the Internet boom, and partly because of new applications based on new technology capabilities, controls and regulation of data access and data movement are ready to explode on our horizon. We must safeguard confidential information (medical images, credit card numbers, employment history, insurance records) from becoming common knowledge because of a lack of controls. Do you recall the panic caused by recent disclosures that financial records of individuals were being made available on the Internet?
Re-engineering business continuity processes is no longer simply an MIS issue. We must assure the viability of the enterprise in the event of a disaster or disruption (or the case of a hacker making illegal use of information gained). As the data becomes distributed throughout the enterprise, every individual shares the responsibility to protect it.
Most organizations are looking for recovery closer to the actual point of failure. Best practices indicate that access to data as of the last backup is no longer acceptable; some businesses, like airline reservations and ATM networks, require recovery with essentially no gap between failure and resumption.
Remember the user requirements list discussed earlier: 'We want critical data off-site; with one hundred percent data availability, one hundred percent data integrity; and elimination of manual processes.' These requirements from the IT community appear to be at odds with the lax approach of most Tactical Computing systems. Heightened awareness of system vulnerability and indications of more vigorous regulatory intervention are becoming a driving force.
More and more companies are addressing disaster recovery planning and business continuity for their distributed data by examining all their data, regardless of the physical location or the creating platform.
This is being done not only in the name of disaster recovery, but with a wider view of re-engineering storage management. The justification is based on the inherent value of the data and the inability of an enterprise to continue to function without its data.
Fortunately, technology advances make re-engineering of business continuity more practical than ever before. Key among them is the transformation of telecommunications into an information superhighway, both on and off the Internet. Since Congress enacted the Telecommunications Act of 1996, mergers of major telephone companies have been abundant.
Cable operators are entering the lucrative telecommunications business. Private networks, dark fiber and SONET are enabling increased communications bandwidth. All these advances are driving increased competition and lower-cost transmission and communication lines.
High-capacity/high-speed tape and its partner, the robot-controlled library, can be located away from the prime processing sites and logically connected over great distances. The price of these lines is no longer a deterrent to good data protection and storage management practices.
Our research indicates that, on average, costs for T1, T3 and SONET lines are coming down at a rate of 30 percent annually in the U.S. Our company experience was that the line charges were about $80,000 per month for T3 services installed in 1992. Today that equivalent line charge is about $1,000 a month.
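To put those two sets of figures side by side, the arithmetic can be sketched in a few lines. This is an illustration only: the five-year span between the 1992 installation and "today" is an assumption, since the article does not state the exact interval, and the function names are hypothetical.

```python
def projected_cost(start_cost, annual_decline, years):
    """Monthly cost after a number of years of steady percentage decline."""
    return start_cost * (1 - annual_decline) ** years


def implied_annual_decline(start_cost, end_cost, years):
    """Steady annual decline that would take start_cost down to end_cost."""
    return 1 - (end_cost / start_cost) ** (1 / years)


YEARS = 5  # assumed interval between the 1992 installation and "today"

# The industry-average 30 percent annual decline applied to the 1992 T3 charge.
average_case = projected_cost(80_000, 0.30, YEARS)

# The annual decline implied by the company's own before-and-after figures.
company_case = implied_annual_decline(80_000, 1_000, YEARS)

print(f"At 30% per year: ${average_case:,.0f} per month")
print(f"Implied by $80,000 -> $1,000: {company_case:.0%} per year")
```

Under that assumption, the company's own T3 experience reflects a considerably steeper decline than the 30 percent average, which only strengthens the point that line charges are no longer the obstacle they once were.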
Cost justification for the expense of business continuity is somewhat easier because of another technology advance. Dedicated telecommunication lines (one for voice, one for video, one for data) are no longer required.
New telecommunication transport methodologies and packetized multiplexing methods of data transfer support a variety of signals on the same line. Recent technology advances in data concentrators and multiplexors allow users with different types of data distributed throughout their enterprises, created at different locations, to maximize their investment by sharing common telecommunications lines for various functions, sources and types of data.
Hardware compression has reached the high end of computing. T3 data compression, impossible until recently, now doubles the effective data rate, from forty-four megabits per second to over eighty-eight megabits per second across a single dedicated line!
Internet and intranet-based remote data storage and access show the best promise for business continuity. Secure, off-site, high-availability data solutions are technologically ripe for the picking.
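What the doubled rate means for a backup window can be estimated roughly as follows. This is a sketch under stated assumptions: it uses the nominal line rates (1.544 Mbit/s for T1, 44.736 Mbit/s for T3), ignores protocol overhead, treats the compression ratio as a fixed 2:1, and uses a hypothetical 10-gigabyte backup set.

```python
T1_MBPS = 1.544   # nominal T1 line rate, megabits per second
T3_MBPS = 44.736  # nominal T3 line rate, megabits per second


def transfer_hours(gigabytes, line_mbps, compression_ratio=1.0):
    """Hours needed to send a backup set over a line, ignoring protocol overhead.

    compression_ratio is the assumed ratio of original to compressed size
    (2.0 means the data shrinks to half its size before transmission).
    """
    bits_to_send = gigabytes * 1e9 * 8 / compression_ratio
    seconds = bits_to_send / (line_mbps * 1e6)
    return seconds / 3600


backup_gb = 10  # hypothetical size of a nightly backup set

print(f"T1, uncompressed:   {transfer_hours(backup_gb, T1_MBPS):.1f} hours")
print(f"T3, uncompressed:   {transfer_hours(backup_gb, T3_MBPS):.2f} hours")
print(f"T3, 2:1 compressed: {transfer_hours(backup_gb, T3_MBPS, 2.0):.2f} hours")
```

Real throughput depends on how compressible the data is and on the devices at each end, so figures like these are best treated as a first-pass planning estimate.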
a tool for Tactical Computing
Tape is the ideal medium for Tactical Computing backup and recovery because it is fast, reliable and affordable. Until recently, tape backup did not offer the benefit of separating the data from the Tactical Computer. SCSI distances were limited to sixty feet or less, so tapes still had to be manually carried off-site. Hot new channel extension technology makes direct off-site recording possible. Known as Remote Electronic Vaulting, these data extension capabilities allow data transfer across unlimited distances.
New technology available to Tactical Computing includes IBM 3490-format tape on SCSI remote channels, DLT (Digital Linear Tape) with mainframe-class speeds and reliability as well as helical recording formats.
Recent developments in automated tape libraries supporting virtually all computing platforms solve the automation quandary. Automated libraries of all sizes abound, from the desktop version carrying a dozen or so tapes to the huge systems holding thousands of tape cartridges.
Automation, combined with remotely connecting your tape system, offers the brightest solution. No matter where you locate a robotic tape library, it can be a valuable asset in its role as the backup/recovery repository. Recovery can be accomplished remotely from an alternate site, enabling you to route data to your remote users quickly.
Software solutions abound
Software advances can take the pain and tedium out of business continuity planning. A myriad of software products has appeared over the horizon, just in time to take advantage of the hardware technology available to Tactical Computing.
Automated backup and recovery products occupy the forefront of software technology. Products like IBM's ADSM, StorageTek's Reel Librarian, Veritas' High Availability Suite and Oracle Backup enable automated remote backup and restore, even 'hot' database backup. Hierarchical Storage Management (HSM) products automatically migrate little-used files to tape and recall them when needed. Even the smallest Windows 95 user can activate the Windows System Agent to perform automated backups.
Other products help with business continuity planning for large and small environments. Products such as Sungard's Comprehensive Business Recovery, Recovery Management's REXSYS, Comdisco's ComPAS and Disaster Recovery Services' Disaster Recovery 2000® for Windows offer the Tactical Computer great benefits. (See the PC Based Software Survey, DRJ, Volume 9, Issue 4, Fall 1996.)
There is a better way
Here are ten practical steps to help assure your Tactical Computing business continuity success:
- Organize your data files (they constitute the 'irreplaceable'), then back them up. Follow the old Chicago adage: 'Back up early and back up often'.
- Send your backups off-site, label them and place them where you can find them.
- Select business recovery options, such as an alternate site and equipment, in case your workplace becomes unavailable. (See Alternate Site Survey article, Disaster Recovery Journal, Volume 9, Issue 3, Summer 1996.)
- Purchase automated backup hardware, software and network facilities if you can. If you cannot, then get to know your company's data administrator. Ask him/her to explain the backup and recovery methodology. Make sure you are comfortable with it.
- Automate your backup and recovery with the equipment and services you purchased.
- Test your ability to recover standard files as well as databases.
- Become involved in your local chapter of a business continuity organization like the Association of Contingency Planners. (See the list of Contingency Planning Group Contacts, a regular feature of the Disaster Recovery Journal.)
- Know where to find resources (see Tari Schreider's excellent article, White Paper: The Internet - Disaster Recovery Issues & Answers). Subscribe to trade journals like the Disaster Recovery Journal and The ACP Sentinel.
- Ask the experts. Contact business recovery providers to get their ideas on best practices. Find vendors in the Vendor Directory on the DRJ home page: www.drj.com.
- Share your knowledge and experience with others.
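The steps on automating backups and testing recovery can be combined into a single routine check: make the backup, restore it to a scratch location, and confirm that what came back matches the source. The sketch below uses only the Python standard library and a compressed tar archive on disk as a stand-in for a tape or remote vault; the directory name, file name and helper names are hypothetical.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def checksums(root):
    """Map each file path (relative to root) to the SHA-256 digest of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def back_up(source_dir, archive_path):
    """Write every file under source_dir into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=".")


def restore_matches_source(source_dir, archive_path):
    """Restore the archive into a scratch directory and compare it with the source."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive_path, "r:gz") as archive:
            archive.extractall(scratch)
        return checksums(scratch) == checksums(source_dir)


# Demonstration on throwaway data: verify a fresh backup, then detect drift.
with tempfile.TemporaryDirectory() as work:
    source = Path(work) / "designs"
    source.mkdir()
    (source / "stripe.plt").write_text("pattern data")
    archive_file = Path(work) / "designs.tar.gz"

    back_up(source, archive_file)
    ok_before = restore_matches_source(source, archive_file)

    (source / "stripe.plt").write_text("changed since backup")
    ok_after = restore_matches_source(source, archive_file)

print(ok_before, ok_after)
```

A mismatch after the source changes is expected; it simply shows that the backup is out of date. A mismatch immediately after a backup is the warning sign that the recovery path itself is broken.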
Tactical Computing will continue its phenomenal growth. It offers great advantages to its users, but new challenges abound. The proliferation of computing platforms and the amount of data created on these tactical systems place great demands on backup and recovery cycles. Users can exploit high-capacity and high-performance communication networks, storage devices and software capable of supporting multiple, diverse operating systems.
When you have finally taken the proper steps to assure Tactical Computing business continuity, then you can expect your boss to follow King Solomon's advice: 'Do not withhold good from those who deserve it, when it is in your power to act.' (Proverbs 3:27, NIV) [i].
Fred Aylstock is an Executive Consultant with StorageTek's Solutions Business Group (Data Availability Services). During his twenty-eight year career in data processing, he has worked in many sectors of computing, including systems engineering, applications, technical support, support management, hardware engineering, data availability and professional services.
Mr. Settergren is currently a Portfolio Manager for Storage Technology's Solutions Business Group. He has a Bachelor of Science degree in computer science and was formerly a Certified Information Systems Auditor (CISA).