
Friday, 26 October 2007 04:19

Closer To Point of Failure

Written by Jim Morin

Until recently, companies have been caught in a risky Catch-22 in safeguarding their data. A large bank, for example, requires a high level of protection for its massive volumes of data, meaning it has to move gigabytes of data to storage every day. But getting the data to storage quickly is just the first part of the equation. Next, the bank needs to get it back. The more data, the longer it takes to recover; the longer the recovery time, the less real protection the company has from downtime and lost business. In other words, the greater the need, the greater the risk.

Fortunately, new networking and storage technologies are enabling information users to formulate new strategies that compress the time needed to back up and then recover data. Network-based storage provides the performance to move massive amounts of enterprise data over long distances within narrow time windows, and the network access to get at the data from different sites. Companies can ensure that their backup data is as close as possible to the point of failure. If (or more likely, when) a failure occurs, that proximity accelerates the recovery process.


Network-based storage is essentially an extension of traditional local storage techniques. The advantages of the former are clear when you consider the limitations of the latter.

Local storage techniques have traditionally relied on “duplexing” or duplicating data in two forms using combinations of disk and tape technology. Should the primary data not be available, the organization initiates the recovery process using the duplex copy.

Applying this strategy to mainframe Direct Access Storage Devices (DASD) involves using the capabilities built into the hardware of DASD controllers to manage the duplexing. These devices can write data to multiple disk drives at the same time. The advantage is that should one disk fail, the controller can automatically switch to the other disk with no interruption in service. However, both disk volumes are attached to the same controller hardware, meaning that the controller is now a single point of failure. Also, cable restrictions limit the DASD to within 200 feet of the controller. To overcome this, some host software applications and operating systems have been designed to ensure greater protection by writing duplicate data onto different DASD systems, thereby achieving greater hardware redundancy. Although this approach requires that the duplexing logic be written directly into the application software, it offers the advantage of being suitable for any model of disk controller.
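The controller-level duplexing described above can be sketched in a few lines of Python. This is an illustration of the behavior, not the interface of any real DASD controller; the class and method names are invented for the example.

```python
# Illustrative sketch of controller-style duplexing: every write goes to
# two volumes, and reads switch to the surviving copy if the primary fails.
# DuplexedVolume and its methods are hypothetical names, not a real API.

class DuplexedVolume:
    def __init__(self):
        self.primary = {}       # track -> data
        self.secondary = {}     # duplex copy on a second drive
        self.primary_ok = True

    def write(self, track, data):
        # The controller writes both copies on every update.
        if self.primary_ok:
            self.primary[track] = data
        self.secondary[track] = data

    def read(self, track):
        # Automatic switch to the duplex copy with no interruption in service.
        if self.primary_ok:
            return self.primary[track]
        return self.secondary[track]

    def fail_primary(self):
        self.primary_ok = False

vol = DuplexedVolume()
vol.write("T001", b"payroll records")
vol.fail_primary()                      # simulate a disk failure
assert vol.read("T001") == b"payroll records"   # service continues
```

Note that both copies here live in one object, mirroring the article's point: the controller itself remains a single point of failure.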

Disk storage is an expensive medium, however, and most organizations quickly move all but essential library management data to tape. Batch processes initiate storage management of files from DASD to tape systems, either stand-alone or fully automated tape libraries. More sophisticated systems journal critical files to tape at the same time the DASD is updated. Should the firm need to recover the data, the journaled files are immediately available, which shortens the overall recovery time.

By nature, local-duplexed approaches have limitations. Since both the duplex-to-DASD and duplex-to-tape approaches assume the devices are co-located in the same facility as the host, these techniques afford protection only for the storage device itself, not the data center as a whole. They do not protect the firm from a disaster, such as a flood, that affects the entire data center.


To protect themselves from such disasters, most organizations simply run a back-up copy of their disk data to a tape drive and store the tapes off site.

Sometimes called CTAM, or the Chevy Truck Access Method (because the tapes are transported by truck), this approach builds an inventory of backup data at an offsite warehouse. If the main processor fails, a “hot site” takes over using the tape backup copy.

Since data centers deal with such large volumes of data, the number of tapes generated and moved by such back-up procedures can be very large, often into the hundreds or even thousands per day. While CTAM effectively moves data off-site, it limits the frequency and speed at which data can be moved to a remote site and seriously impacts the time to recover.

Recovering from a disaster in this fashion could take a week or longer, depending on the level of service contracted with the hot site provider, because the off-site backup tapes must be physically transported to the host recovery facility (hot site).
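A bit of back-of-envelope arithmetic shows why truck-based recovery stretches into days. All figures below are assumptions chosen for illustration, not values from the article:

```python
# Illustrative arithmetic only: every figure here is an assumption, not a
# number from the article. It shows why truck-based recovery is so slow.

tapes = 2000                 # tapes accumulated at the off-site vault
gb_per_tape = 0.2            # ~200 MB per early-1990s cartridge
restore_mb_per_sec = 3.0     # assumed single-drive restore throughput
drives = 2                   # parallel tape drives at the hot site
transport_hours = 12         # retrieve, truck, catalog, and mount time

restore_hours = (tapes * gb_per_tape * 1024) / restore_mb_per_sec / 3600 / drives
total_hours = transport_hours + restore_hours
print(f"restore: {restore_hours:.1f} h, total: {total_hours:.1f} h")
```

Even with these fairly generous assumptions, recovery exceeds a full day, and any mount delays, mis-cataloged tapes, or contention for hot-site drives pushes it further toward the week mark.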

Managing the process of manually cataloging, moving, and storing these tapes is a tremendous task, vulnerable to human error and transportation problems.

In addition, many organizations that have actually experienced a disaster have found that retrieving the tapes and transporting them to the backup data center can be dangerous, impractical, or even impossible.


Network-based storage takes a true enterprise-oriented approach, improving on the state of the art in remote storage. It automates the process of moving data off-site and provides access to enterprise data anywhere over a wide area network, without sacrificing performance.

From the perspective of the storage devices (e.g., DASD, tape, optical, microfiche), the processing power (i.e., MIPS) that controls them can be located virtually anywhere on the network.

Today’s applications for network-based storage involve different techniques for disaster protection and recovery as well as improved access to information over the network.

Each technique or application involves different elements of software, networking and peripherals, depending on the individual requirement and budget.



Recovery in Hours: Remote Duplex To Tape (Electronic Tape Vaulting)
One way to overcome the shortcomings of a local-duplex strategy is to move the back-up data to a remote site using a combination of products and services:

1. High-speed transmission services like T1 or DS-3 circuits or fiber optic links,
2. Robotic tape libraries such as the StorageTek 4400 Automated Cartridge System or the Memorex-Telex 5400 Automated Tape Library,
3. Channel networking systems.

Often referred to as “electronic tape vaulting,” this strategy provides high-speed, unlimited-distance storage of back-up data without the need to manually handle tape cartridges.

Although many companies use electronic vaulting solely as a means of moving data from a single data center to an off-site storage vault, much can be gained by connecting multiple data centers to the storage site.

One clear benefit is that the fixed costs of the vault are shared by multiple sites, providing a cost savings for the organization. But more importantly for recoverability, connecting multiple sites to the vault enables the organization to restore data from a failed data center to any other data center on the network.

One catch: recovering data to an alternate site requires that the alternate site know which tapes to restore. This information is managed by software running on the host, called the Library Management System, that maps data set names to specific tapes in the library and stores this information on DASD. Sharing an electronic vault among multiple sites requires that all sites have access to the library management data.
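The role the Library Management System plays here is essentially a catalog lookup: given the data sets a failed site needs, resolve which tape volumes the alternate site must pull from the vault. The sketch below illustrates that mapping; the data set names, volume serials, and function are all invented for the example and do not reflect any real LMS product's interface.

```python
# Minimal sketch of a library management catalog: it maps data set names
# to tape volumes so an alternate site knows which tapes to restore.
# All names here are hypothetical, for illustration only.

catalog = {
    "PROD.PAYROLL.BACKUP": ["VOL1001", "VOL1002"],
    "PROD.LEDGER.BACKUP":  ["VOL1003"],
}

def tapes_to_restore(dataset_names):
    """Resolve the tape volumes an alternate site must pull from the vault."""
    volumes = []
    for name in dataset_names:
        volumes.extend(catalog.get(name, []))
    return volumes

print(tapes_to_restore(["PROD.PAYROLL.BACKUP"]))   # -> ['VOL1001', 'VOL1002']
```

Since this catalog itself lives on DASD, the article's point follows directly: unless every site on the network can reach that DASD, no site can perform the lookup that recovery depends on.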

Fortunately, recent advances in channel networking technology have rendered this little more than a network configuration issue. In 1992, leading channel networking systems gained the ability to provide unlimited distance connectivity to DASD in much the same manner as the tape library itself.

With this capability, hosts located anywhere on the network have gained access to the DASD storing the library management data so long as the DASD is also located on the network.

Example: Utility Company

For a utility company serving a major metropolitan area, manually transferring tapes to remote storage was an unacceptable recovery process in a disaster situation.

To protect their business and their customers, the company set a one-hour recovery goal. For their purposes, this means maintaining back-up application systems, as well as remote duplex storage to tape. All three data centers mutually back up each other. This involves synchronizing the electronic transfer of huge volumes of tapes (200-400 tapes per day) over high-speed DS-3 trunks.

Because of the tremendous amount of data involved and the high cost of the links, some important networking capabilities come into play. Load leveling allows the company to balance the traffic levels on the links to ensure that one link isn’t bottlenecked while another is underutilized.
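One simple way to picture load leveling is a greedy scheduler that always sends the next tape transfer down the least-loaded link. The sketch below is an illustration of the balancing idea under that assumption, not how any particular channel networking product actually schedules traffic:

```python
import heapq

# Hedged sketch of load leveling: assign each tape transfer to whichever
# link currently has the least queued work, so no single DS-3 trunk is
# bottlenecked while another sits idle. Sizes in MB are illustrative.

def level_load(transfer_sizes_mb, num_links):
    """Greedy largest-job-first assignment to the least-loaded link."""
    links = [(0.0, i, []) for i in range(num_links)]  # (load, link id, jobs)
    heapq.heapify(links)
    for size in sorted(transfer_sizes_mb, reverse=True):
        load, i, jobs = heapq.heappop(links)          # least-loaded link
        jobs.append(size)
        heapq.heappush(links, (load + size, i, jobs))
    return sorted(links)

links = level_load([400, 200, 200, 300, 100], num_links=2)
for load, i, jobs in links:
    print(f"link {i}: {load:.0f} MB queued -> {jobs}")
```

Sorting jobs largest-first before assigning them is a standard heuristic that keeps the final loads close to even, which is exactly the property the company needs on expensive DS-3 trunks.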

In addition to networked storage access, the network provides alternate path routing to ensure the data gets to its destination.

Continuous availability is achieved by dynamically switching to any data center through the resilient channel network.


One drawback of relying on tape back-ups is that they must be restored to disk, a process that typically accounts for most of the time-to-recover.

Another strategy accelerates the process of restoring data by providing back-up of key data sets directly to redundant disk drives via channel-to-channel links.

Remote duplex via channel-to-channel contrasts with the strategies discussed above in that the communication takes place between two hosts rather than between a host and a peripheral (tape drive).

The advantage is that host-to-host operations are inherently faster than host-to-tape operations. This strategy also reduces the organization’s time-to-recover since the intermediate step of restoring tapes to disks is eliminated.

One approach uses channel-to-channel capabilities to duplicate entire data volumes at an alternate data center. This can be accomplished using a batch-oriented process of copying entire files to the alternate site using file-transfer software. At the remote site, the alternate host receives the data and copies it to the local disk drives, creating a complete copy of the original data.

Another variation uses journaling or shadowing capabilities of application or database management software to keep track of the changes to a database in real-time.

Since updates are applied to both the local and remote copies of the database, the processing workload can smoothly shift to the remote site. A key design feature of these products is to establish a “logical quiesce” or “point-in-time” to synchronize the database to a recovery point. The recovery procedure can now take place in hours (for journaled files), or could be designed for non-disruptive switching, if the data, transactions, response time and connectivity are all maintained, for total transparency to the user.
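The “logical quiesce” idea can be made concrete with a small sketch: replay journaled updates at the remote site in sequence, and take the consistent snapshot at the quiesce marker rather than at the last in-flight update. The journal format, keys, and function below are invented for illustration and do not represent any particular database product:

```python
# Hedged sketch of journal-based recovery to a "logical quiesce" point.
# The remote site replays journaled updates in sequence; the recovery
# point is the state at the marker, not the last in-flight update.
# All record formats and names here are hypothetical.

journal = [
    (1, "ACCT100", 500),
    (2, "ACCT200", 750),
    (3, "QUIESCE", None),     # point-in-time marker written by the DBMS
    (4, "ACCT100", 999),      # in-flight update after the marker
]

def recover_to_quiesce(journal):
    """Apply updates in order, returning the state at the last quiesce marker."""
    db = {}
    snapshot = {}
    for _seq, key, value in journal:
        if key == "QUIESCE":
            snapshot = dict(db)   # consistent recovery point
        else:
            db[key] = value
    return snapshot

print(recover_to_quiesce(journal))   # -> {'ACCT100': 500, 'ACCT200': 750}
```

The in-flight update after the marker is deliberately excluded: synchronizing to the marker is what lets the workload shift to the remote site with a database every application agrees on.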

Especially in large data volume applications, channel networking software accelerates channel-to-channel file transfer operations further by using techniques such as load-leveling over multiple channels at the same time. CHANNELspeed also enables transfer across SNI boundaries without requiring an NCP gateway through the Front End Processors.


Customers whose data center configurations already include hosts in multiple locations can use the strategies described above to better leverage their investment. However, adding a host to the remote location specifically for duplex applications can be cost-prohibitive.

Customers who do not have hosts in multiple locations can still perform remote duplexing without adding host processors.

The remote duplex can be either batch oriented using a simple procedure to copy a file from one disk to another or it can be real-time using the duplexing capabilities of the disk controller or the application software.

Much has been said and written about the ability of information technology to provide organizations with a competitive advantage. This is now a matter of fact.

Businesses depend so much on their information technology that they cannot afford to lose even a few hours’ worth of information. Nor can they afford to be limited in their ability to access that information quickly.

Jim Morin is the Manager of Systems Marketing for Computer Network Technology Corporation in Minneapolis, MN.

This article adapted from Vol. 6 #3.
