
NETWORK-BASED STORAGE PROVIDES RECOVERY CAPABILITIES CLOSER TO POINT OF FAILURE
By Jim Morin
Until recently, companies have been caught in a risky Catch-22 in safeguarding their data. A large bank,
for example, requires a high level of protection for its massive volumes of data, meaning it has to move
gigabytes of data to storage every day. But getting the data to storage quickly is just the first part of the
equation. Next, they need to get it back. The more data, the longer it takes to recover; the longer the
recovery time, the less real protection the company has from down time and lost business. In other
words, the greater their need, the greater their risk.
Fortunately, new networking and storage technologies are enabling information users to formulate new
strategies that compress the time needed to back up and then recover data. Network-based storage
provides the performance to move massive amounts of enterprise data over long distances within
narrow time windows and the network access to get at the data from different sites. Companies can
ensure that their backup data is as current as possible at the time of failure. If (or, more likely, when) a
failure occurs, this accelerates the recovery process.
THE LONG AND SHORT OF LOCAL STORAGE STRATEGIES
Network-based storage is essentially an extension of traditional local storage techniques. The
advantages of the former are clear when you consider the limitations of the latter.
Local storage techniques have traditionally relied on duplexing or duplicating data in two forms using
combinations of disk and tape technology. Should the primary data not be available, the organization
initiates the recovery process using the duplex copy.
Applying this strategy for mainframe Direct Access Storage Devices (DASD) involves using the
capabilities built into the hardware of DASD controllers to manage the duplexing. These devices are
capable of writing data to multiple disk drives at the same time. The advantage is that should one disk
fail, the controller can automatically switch to the other disk with no interruption in service. However,
both disk volumes are attached to the same controller hardware, meaning that the controller is now a
single point of failure. Also, cable restrictions require the DASD to be within 200 feet of the controller. To
overcome this, some host software applications and operating systems have been designed to ensure
greater protection by writing duplicate data onto different DASD systems, thereby achieving greater
hardware redundancy. Although this approach requires the duplexing logic be written directly into the
application software, it offers the advantage of being suitable for any model disk controller.
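The application-level duplexing described above can be sketched roughly as follows. This is only an illustration, not any vendor's implementation: the Volume and DuplexedStore classes are hypothetical, and the in-memory dictionaries stand in for two DASD systems attached to separate controllers.

```python
class VolumeUnavailable(Exception):
    """Raised when a (simulated) volume cannot be reached."""


class Volume:
    """Hypothetical stand-in for one DASD volume behind its own controller."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self._records = {}

    def write(self, key, data):
        if not self.online:
            raise VolumeUnavailable(self.name)
        self._records[key] = data

    def read(self, key):
        if not self.online:
            raise VolumeUnavailable(self.name)
        return self._records[key]


class DuplexedStore:
    """Writes every record to both volumes; reads fall back to the duplex copy."""

    def __init__(self, primary, duplex):
        self.primary = primary
        self.duplex = duplex

    def write(self, key, data):
        # Both copies are written before the write is considered complete.
        self.primary.write(key, data)
        self.duplex.write(key, data)

    def read(self, key):
        try:
            return self.primary.read(key)
        except VolumeUnavailable:
            # Primary is down: recover from the duplex copy.
            return self.duplex.read(key)
```

Because the duplexing logic lives in the application rather than in the controller, the sketch makes no assumptions about the hardware behind each volume, which mirrors the portability advantage noted above.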
Disk storage is an expensive medium, however, and most organizations quickly move all but essential
library management data to tape. Batch processes initiate storage management of files from DASD to
tape systems, either stand-alone or fully automated tape libraries. More sophisticated systems journal
critical files to tape at the same time the DASD is updated. Should the firm need to recover the data, the
journaled files are immediately available, which shortens the overall recovery time.
By nature, local-duplexed approaches have limitations. Since both the duplex to DASD and duplex to
tape approaches assume the devices are co-located in the same facility with the host, these techniques
only afford protection for the storage device itself, not the data center as a whole. They do not protect
the firm from a disaster such as a flood that affects the entire data center.
THE STATE-OF-THE-ART INFORMATION HIGHWAY
To protect themselves from such disasters, most organizations simply run a back-up copy of their disk
data to a tape drive and store the tapes off site.
This approach is sometimes called CTAM, or Chevy Truck Access Method, because the tapes are transported
by truck. Over time, the offsite warehouse builds an inventory of backup data. If a failure of the main
processor occurs, the hot site takes over using the tape backup copy.
Since data centers deal with such large volumes of data, the number of tapes generated and moved by
such back-up procedures can be very large, often into the hundreds or even thousands per day. While
CTAM effectively moves data off-site, it limits the frequency and speed at which data can be moved to
a remote site and seriously impacts the time to recover.
Recovering from a disaster in this fashion could take a long period of time, up to a week or longer
depending on the level of service contracted with the hot site provider. This is because off-site backup
tapes would need to be physically sent to the host recovery facility (hot site).
Managing the process of manually cataloging, moving and storing these tapes is a tremendous task
which is vulnerable to human error and transportation problems.
In addition, many organizations that have actually experienced a disaster have found that
retrieving the tapes and transporting them to the backup data center can be dangerous, impractical or
even impossible.
NETWORK-BASED STORAGE APPLICATIONS
Network-based storage takes a true enterprise-oriented approach to improving on state-of-the-art remote
storage. Network-based storage automates the process of moving data off-site and provides access to
the enterprise data anywhere over a wide area network, without sacrificing performance.
From the perspective of the storage devices (e.g., DASD, tape, optical, microfiche), the processing
power (i.e., MIPS) that controls them can be located virtually anywhere on the network.
Today's applications for network-based storage involve different techniques for disaster protection and
recovery as well as improved access to information over the network.
Each technique or application involves different elements of software, networking and peripherals,
depending on the individual requirement and budget.
NETWORK-BASED STORAGE CONFIGURATIONS
RECOVERY IN HOURS: Remote Duplex To Tape (Electronic Tape Vaulting)
One way to overcome the shortcomings of a local-duplex strategy is to move the back-up data to a
remote site using a combination of products and services:
1. High-speed transmission services like T1 or DS-3 circuits or fiber optic links,
2. Robotic tape libraries such as the StorageTek 4400 Automated Cartridge System or the
Memorex-Telex 5400 Automated Tape Library,
3. Channel networking systems.
Often referred to as electronic tape vaulting, this strategy provides high-speed, unlimited-distance
storage of back-up data without the need to manually handle tape cartridges.
Although many companies use electronic vaulting solely as a means of moving data from a single data
center to an off-site storage vault, much can be gained by connecting multiple data centers to the
storage site.
One clear benefit is that the fixed costs of the vault are shared by multiple sites, providing a cost savings
for the organization. But more importantly for recoverability, connecting multiple sites to the vault
enables the organization to restore data from a failed data center to any other data center on the
network.
One catch: recovering data to an alternate site requires that the alternate site know which tapes to
restore. This information is managed by software running on the host, called the Library Management
System, that maps data set names to specific tapes in the library and stores this information on DASD.
Sharing an electronic vault among multiple sites requires that all sites have access to the library
management data.
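The mapping that such library management software maintains can be pictured with a small sketch. The TapeCatalog class, the data set names and the volume serials below are all hypothetical; a real system keeps this catalog on shared DASD rather than in memory.

```python
from collections import defaultdict


class TapeCatalog:
    """Hypothetical catalog mapping data set names to tape volume serials."""

    def __init__(self):
        self._tapes_by_dataset = defaultdict(list)

    def record_backup(self, dataset, volser):
        # A data set may span several tapes; keep them in write order.
        self._tapes_by_dataset[dataset].append(volser)

    def tapes_for(self, dataset):
        # Any site restoring this data set needs exactly this list.
        if dataset not in self._tapes_by_dataset:
            raise KeyError(f"no backup recorded for {dataset}")
        return list(self._tapes_by_dataset[dataset])


catalog = TapeCatalog()
catalog.record_backup("PAYROLL.MASTER", "VOL001")
catalog.record_backup("PAYROLL.MASTER", "VOL002")
catalog.record_backup("BILLING.DAILY", "VOL007")
```

An alternate site can restore PAYROLL.MASTER only if it can call something like tapes_for against the same catalog, which is why every site sharing the vault needs access to the library management data.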
Fortunately, recent advances in channel networking technology have rendered this little more than a
network configuration issue. In 1992, leading channel networking systems gained the ability to provide
unlimited distance connectivity to DASD in much the same manner as the tape library itself.
With this capability, hosts located anywhere on the network have gained access to the DASD storing
the library management data so long as the DASD is also located on the network.
Example: Utility Company
For a utility company serving a major metropolitan area, manually transferring tapes to remote storage
was an unacceptable recovery process in a disaster situation.
To protect their business and their customers, the company set a one-hour recovery goal. For their
purposes, this means maintaining back-up application systems, as well as remote duplex storage to tape.
All three data centers mutually back up each other. This involves synchronizing the electronic transfer of
huge volumes of tapes (200-400 tapes per day) over high-speed DS-3 trunks.
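A rough calculation shows why link speed matters at these volumes. The line rates below are the standard T1 and DS-3 rates; the 200 MB per cartridge figure is an assumption made purely for illustration, and the estimate ignores protocol overhead, compression and contention.

```python
T1_BPS = 1.544e6     # T1 line rate, bits per second
DS3_BPS = 44.736e6   # DS-3 line rate, bits per second

ASSUMED_MB_PER_TAPE = 200  # assumption for illustration only


def transfer_hours(tapes, mb_per_tape, link_bps, links=1):
    """Hours to send `tapes` cartridges of `mb_per_tape` megabytes each."""
    total_bits = tapes * mb_per_tape * 1e6 * 8
    return total_bits / (link_bps * links) / 3600


for tapes in (200, 400):
    hours = transfer_hours(tapes, ASSUMED_MB_PER_TAPE, DS3_BPS)
    print(f"{tapes} tapes over one DS-3: about {hours:.1f} hours")
```

Under these assumptions, 200 tapes take roughly two hours on a single DS-3 trunk, while the same volume over a single T1 circuit would take well over two days.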
Because of the tremendous amount of data involved and the high cost of the links, some important
networking capabilities come into play. Load leveling allows the company to balance the traffic levels on
the links to ensure that one link isn't bottlenecked while another is underutilized.
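One simple way to picture load leveling is a greedy scheduler that always hands the next transfer to the least-loaded link. This is a sketch under stated assumptions, not a description of how any channel networking product works: the link names and transfer sizes are hypothetical.

```python
def level_load(transfer_sizes, link_names):
    """Assign each transfer to whichever link currently carries the least data."""
    load = {name: 0 for name in link_names}
    assignment = {name: [] for name in link_names}
    # Placing the largest transfers first tends to give a more even split.
    for size in sorted(transfer_sizes, reverse=True):
        target = min(link_names, key=lambda name: load[name])
        assignment[target].append(size)
        load[target] += size
    return assignment, load


# Six hypothetical transfers (sizes in MB) spread over two DS-3 trunks.
assignment, load = level_load([200, 200, 150, 120, 80, 50], ["DS3-A", "DS3-B"])
```

With these sample sizes, each trunk ends up carrying 400 MB, so neither link sits idle while the other is saturated.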
In addition to networked storage access, the network provides alternate path routing to ensure the data
gets to its destination.
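Alternate path routing can be illustrated with a breadth-first search that skips failed links. The three-site topology and the names DC1, DC2 and DC3 are hypothetical, and real channel networks make this decision in the networking equipment rather than in application code.

```python
from collections import deque


def find_path(links, source, destination, failed=frozenset()):
    """Breadth-first search for a route that avoids any failed links."""
    neighbors = {}
    for a, b in links:
        if (a, b) in failed or (b, a) in failed:
            continue
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # no surviving route


# Three data centers, each linked to the other two.
LINKS = [("DC1", "DC2"), ("DC2", "DC3"), ("DC1", "DC3")]
```

If the direct DC1 to DC3 link is marked as failed, the search returns the route through DC2 instead, so the data still reaches its destination.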
Continuous availability is achieved by dynamically switching to any data center through the resilient
channel network.
RECOVERY IN MINUTES: Remote Duplex To DASD
One drawback of relying on tape back-ups is that they must be restored to a disk, a process which
typically accounts for most of the time-to-recover.
Another strategy accelerates the process of restoring data by providing back-up of key data sets
directly to redundant disk drives via channel-to-channel links.
Remote duplex via channel-to-channel contrasts with the strategies discussed above in that the
communication takes place between two hosts rather than between a host and a peripheral (tape drive).
The advantage is that host-to-host operations are inherently faster than host-to-tape operations. This
strategy also reduces the organization's time-to-recover since the intermediate step of restoring tapes to
disks is eliminated.
One approach uses channel-to-channel capabilities to duplicate entire data volumes at an alternate data
center. This can be accomplished using a batch-oriented process of copying entire files to the alternate
site using file-transfer software. At the remote site, the alternate host receives the data and copies it to
the local disk drives, creating a complete copy of the original data.
Another variation uses journaling or shadowing capabilities of application or database management
software to keep track of the changes to a database in real-time.
Since updates are applied to both the local and remote copies of the database, the processing workload
can smoothly shift to the remote site. A key design feature of these products is the ability to establish a logical
quiesce, or point in time, at which the database is synchronized to a recovery point. The recovery procedure
can then take place in hours (for journaled files), or it can be designed for non-disruptive switching if
the data, transactions, response time and connectivity are all maintained, giving total transparency to the
user.
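The idea of bringing a remote copy forward to a recovery point can be sketched as a journal replay. The record layout (sequence number, key, value) and the function name are assumptions for illustration; actual database journaling products use their own formats.

```python
def replay_journal(base_copy, journal, recovery_point):
    """Apply journaled updates with sequence numbers up to `recovery_point`."""
    database = dict(base_copy)  # leave the base copy untouched
    for sequence, key, value in sorted(journal, key=lambda entry: entry[0]):
        if sequence > recovery_point:
            break  # later updates fall after the chosen recovery point
        database[key] = value
    return database


# Hypothetical remote copy and the journal shipped from the primary site.
base = {"acct-1": 100, "acct-2": 50}
journal = [(1, "acct-1", 120), (2, "acct-2", 40), (3, "acct-1", 90)]
recovered = replay_journal(base, journal, recovery_point=2)
```

Here the remote database is synchronized to the state as of update 2; update 3, which arrived after the chosen point in time, is not applied.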
Especially in large data volume applications, channel networking software accelerates channel-to-channel
file transfer operations further by using techniques such as load-leveling over multiple channels at the
same time. CHANNELspeed also enables transfer across SNI boundaries without requiring an NCP
gateway through the Front End Processors.
REMOTE DASD WITHOUT A REMOTE HOST
Customers whose data center configurations already include hosts in multiple locations can use the
strategies described above to better leverage their investment. However, adding a host to the remote
location specifically for duplex applications can be cost-prohibitive.
Customers who do not have hosts in multiple locations can still perform remote duplexing without
adding host processors.
The remote duplex can be either batch oriented using a simple procedure to copy a file from one disk to
another or it can be real-time using the duplexing capabilities of the disk controller or the application
software.
Much has been said and written about the ability of information technology to provide organizations
with a competitive advantage. This is now a matter of fact.
Businesses depend so much on their information technology that they cannot afford to lose even a few
hours' worth of information. Nor can they afford to be limited in their ability to access that information
quickly.
Jim Morin is the Manager of Systems Marketing for Computer Network Technology Corporation in
Minneapolis, MN.
This article is adapted from Vol. 6 #3.
Disaster Recovery World© 1999, and Disaster Recovery Journal©
1999, are copyrighted by Systems Support, Inc. All rights reserved. Reproduction
in whole or part is prohibited without the express written permission from
Systems Support, Inc.