October 26, 2007

ABARS: Solving Today’s Recovery Challenges

Will the recovery really work? The issues related to answering that question are indeed mind-boggling. Providing fast, complete backup and recovery in today’s 24X7 processing environments for both local and disaster outage scenarios across multiple applications and data types is of paramount importance. Downtime windows of 8-12 hours each weekend to create full-volume dumps are becoming unworkable. Incremental backup failures each night are becoming more common. The use of Aggregate Backup and Recovery Support (ABARS) will provide the necessary function required to address these concerns.

Many kinds of backup tools are currently used in today’s environment, each intended to address different recovery situations. Volume dumps are intended to protect against HDA failure, incremental backups protect against single data set loss, IMAGECOPIES are needed for online databases, etc.

Although most installations focus on the backup process, the real issue is RECOVERY. Recovery must be cost-effective, streamlined, complete and all-encompassing. Enormous overhead is spent backing up data redundantly with multiple tools and still many recovery requirements can’t be met. Problems traditionally include missing data, incompatible device geometries, data/catalog synchronization issues, huge manual effort and unacceptably long recovery times.

Additionally, implementing DFSMS and its related strategies requires that old, ‘tried and true’ backup processes be re-examined, especially for disaster recovery. Previously, ‘critical’ data was hand placed on certain DASD and dumped for backup purposes. DFSMS, if fully implemented and exploited correctly, completely removes physical device dependencies, with data now existing anywhere in a hierarchy. SMS Managed Tape further complicates this issue. Aggressive migration policies cause volume dumps to miss critical data. Multi-volume data sets cause additional complications. Volume dumps may get a nice return code zero (0) at backup time, but have inherent problems during the recovery.

Identifying critical application data isn’t easy. Extremely long, complex JCL streams, legacy systems, databases and the ‘fluid’ nature of some applications make manually defining and maintaining a complete selection list virtually impossible. Many files are shared across multiple applications or online systems. What are the critical data sets? Where does the synchronization point occur? How would the data be recovered?

To solve the above issues, a completely new type of recovery methodology is required: one that eliminates redundant backup copies, ensures all required data is available, makes hardware geometries ‘transparent’, provides synchronization, works in all vaulting configurations and provides a quick, automated recovery process. ABARS provides this functionality.

Volume Dumps

Volume dumps are typically taken for two different purposes. First, system volume dumps are done for backup of nucleus data, to provide system recovery with ‘stand-alone’ techniques. Second, ‘application’ volume dumps are taken to protect against hardware failures or site outages. These volumes could also contain program products or TSO type data. The value of these dumps is somewhat questionable, due to the non-static nature of the data. While a dump may be better than no backup at all, quite a number of applications will be unable to use the majority of data recovered from dumps. Dumps become less usable as the number of multi-volume data sets increases, since the ‘pieces’ of the data set that lie on each volume could be out of synchronization with the other pieces.

Synchronization

Synchronization requirements permeate the entire backup/recovery process and need to be examined for all backup tools. Synchronization ensures ALL components required for processing are usable. These include the actual data itself, application program product libraries, catalog entries and structure, pre-allocated libraries, DFSMShsm (Data Facility Storage Management Subsystem Hierarchical Storage Manager) Control Data Set entries, RACF, Tape Management entries, and so on. Without all of the above components being synchronized, correct results from application processing are in jeopardy.

Historically, synchronization was easy to accomplish. Installations ‘shut down’ for some number of hours each day or week in order to do any maintenance or cleanup required. If all volumes were dumped during this period of time, synchronization was guaranteed since no other activity was in process.

Today, total system synchronization is virtually impossible. Many installations must process 24 hours a day, 7 days a week. Downtime is no longer available. The current use of backup ‘windows’ is effectively useless, since application processing has overflowed into the window. No synchronization, as defined above, occurs. Also, the large amount of data required causes the backup window to be continually missed. ABARS doesn’t require system downtime windows. Instead, it is performed at each application’s logical synch point.

Using Volume Dumps and Incremental Backups For Volume Forward Recovery

Incremental backups are intended to reduce the amount of data backed up on a daily basis and are taken if the data has been modified. Volume dumps are used to ‘shorten’ the amount of time required to recover the majority of DASD data. Incrementals can be ‘applied’ to provide for volume ‘forward recovery’. While this addresses most storage administration concerns regarding data recovery, many applications can’t process since this data is now ‘out of synch’ with other application data. An application could consist of 100 data sets. If 99 data sets are successfully copied, but one is missed (for any reason), the application recovery is in question. The application usually needs to apply its own backups over the data already recovered by the above full volume process. This, in effect, requires three different recoveries. It could be argued that the first two recoveries (the volume recovery and the APPLY INCREMENTAL) are of little value since they could be ‘overlaid’ by the application recovery anyway. ABARS will ensure application data backup/recovery synchronization.

Recovery of DFSMShsm Migrated Data

Without ABARS, migrated data recovery is extremely difficult. A long, cumbersome and error-prone process is the only available mechanism for recovering this data. As such, many installations have little incentive to aggressively migrate critical production data. They tend to leave it on costly DASD to ensure it can be backed up via the volume dump process. The dollar cost of this practice is difficult to determine, but could total millions of dollars for a medium-size (5 terabytes) installation. This cost will vary depending on individual migration criteria and the amount of data that is considered critical. ABARS solves this problem, since it supports backup and recovery of migrated data.

Analysis of Current Backup/Recovery Costs

How much processing time and system resources are spent on incremental backups, volume dumps, geners, repros, image copies, etc.? How many tapes are needed to contain all of this data? With DFSMShsm, what amount of time, resources and effort is spent providing backup tape RECYCLE? A better alternative would be to eliminate the RECYCLE process for backup tapes altogether. ABARS tapes do not require RECYCLE.

What staff resources are required to support the backup/recovery process? Are volume dumps and incrementals monitored daily, with any errors corrected immediately? If not, then complete forward recovery is suspect. What if the data set whose backup failed is the one that needs to be recovered?

Summary of Volume Dump And Incremental Backup Issues

While these backups were the industry standard for years, they have inherent flaws and shortcomings that make them undesirable in today’s environments. It should also be noted that these tools address only DASD data. These tools do not protect against loss of data that is on tape (or other media) or is currently migrated or archived. Backup and recovery methodologies must evolve from a physical, volume oriented process to one that is logical in nature.

Disaster Recovery

Unique requirements exist for disaster recovery (DR). While these requirements will vary depending on unique installation configurations and business resumption requirements, most DR systems have a similar ‘core’ of requirements.

DR backups are intended to be shipped offsite, to ensure they are not affected in the case of a site outage. If the DR backup can also be used for local recovery purposes, the overall cost of availability management is drastically reduced. Either two copies of the DR backup need to be made (one to stay onsite, one to be vaulted), or an ‘Electronic Vault’ configuration should be investigated.

Addressing Today’s Recovery Challenges

To address the above issues, the recovery tool of choice needs to:

  • Provide required recovery in a true SMS configuration
  • Significantly reduce or eliminate backup ‘downtime windows’
  • Ensure application data & catalog synchronization
  • Address device geometry differences
  • Provide for recovery of more than just DASD data
  • Ensure easy DFSMShsm migrated data recovery
  • Exploit Electronic Vaults
  • Significantly reduce the time required to provide for complete site recovery
  • Reduce/eliminate RECYCLE overhead for backup tapes
  • Provide an easy method to move applications or data centers
  • Reduce the amount of technical knowledge and decision making required of staff during recovery

The ABARS Solution

ABARS provides the solution to many of the above challenges. ABARS is provided as part of the IBM DFSMShsm product, and is specifically designed to address the issue of logical application disaster recovery. It has the ability to solve synchronization issues, handles geometry differences, addresses quick application recovery and fits well into the DFSMS world. ABARS is not used for the actual system recovery. This is best achieved today with a stand-alone restore of the system volumes. DFSMShsm must be available to recover with ABARS. Regarding DFSMS, only the Address Space is required at the primary site. The actual application data to be ABARS managed does not need to be SMS managed.

Application Synchronization

ABARS is best run as a step within an application job stream, at a logical synch point determined by application analysts, or automatically via a third party product. It can also be used to back up database files, and easily exploits hardware function, such as Concurrent Copy. The application analyst determines the data set selection ‘filters’ required. Where the data resides, whether on DASD, Migrated (ML1 or ML2) or on Tape, is no longer important. However, it must be cataloged. Applications may require several ‘synch points’. Synchronization is guaranteed with ABARS. Any data set that is missed, for whatever reason, causes the entire ABARS backup to fail.
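The all-or-nothing rule described above can be illustrated with a minimal Python sketch. This is not ABARS code: the `aggregate_backup` function, the `AggregateBackupResult` type, the data set names and the dictionary standing in for the catalog are all hypothetical, chosen only to show why one missed data set fails the whole aggregate.

```python
from dataclasses import dataclass, field


@dataclass
class AggregateBackupResult:
    """Outcome of one hypothetical aggregate backup attempt."""
    succeeded: bool
    backed_up: list = field(default_factory=list)
    missed: list = field(default_factory=list)


def aggregate_backup(selected, catalog):
    """Back up every selected data set or none at all.

    `selected` is the list of data set names chosen by the filters.
    `catalog` maps cataloged names to where they reside (DASD, ML1, ML2, TAPE).
    Both are illustrative stand-ins, not real DFSMShsm structures.
    """
    missed = [name for name in selected if name not in catalog]
    if missed:
        # One missed data set fails the entire aggregate: nothing is kept.
        return AggregateBackupResult(succeeded=False, missed=missed)
    # Location in the hierarchy does not matter, only that the name is cataloged.
    return AggregateBackupResult(succeeded=True, backed_up=list(selected))


catalog = {
    "PAY.MASTER": "DASD",
    "PAY.HISTORY": "ML2",
    "PAY.ARCHIVE": "TAPE",
}

ok = aggregate_backup(["PAY.MASTER", "PAY.HISTORY"], catalog)
bad = aggregate_backup(["PAY.MASTER", "PAY.UNCATALOGED"], catalog)
print(ok.succeeded, bad.succeeded, bad.missed)
```

The design point is that a partial result is never returned: a caller either gets the full synch-point image or a failure naming what was missed.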

The benefit of this concept becomes obvious during the recovery process. All of the data backed up at the logical synch point is restored, and the application can begin running at the next job step. There is no need to coordinate the recovery between multiple backup utilities, or data backed up at different times.

Catalog Synchronization

Tedious catalog ‘scrubbing’ is no longer required. ABARS synchronizes the catalog during the recovery process. This ensures catalogs accurately reflect the data you have available at any time. ABARS can rebuild entire user catalog structures. GDG bases will be recovered if necessary, and GDG data sets connected automatically.

The Multi-Task Recovery Process

ABARS provides multi-task recovery to ensure a large number of applications can be recovered very quickly. With proper analysis of application priorities, critical application processing can begin quickly after an outage. The ability to run multiple, simultaneous recoveries matters because normal DFSMShsm 1.2 (and earlier) recovery is single threaded, and recovering an entire site from incremental backups can therefore take an enormous amount of time.
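As a rough illustration of priority-ordered, simultaneous recoveries, the following Python sketch runs several recoveries at once using a thread pool. It only models the idea: `recover_aggregate`, `recover_site`, the `max_tasks` parameter and the aggregate names are assumptions made for this example, and a real recovery would be driven through DFSMShsm rather than a Python function.

```python
from concurrent.futures import ThreadPoolExecutor


def recover_aggregate(name):
    """Stand-in for one aggregate recovery (hypothetical)."""
    return f"{name}: recovered"


def recover_site(aggregates, max_tasks=4):
    """Recover aggregates concurrently, submitting the most critical first.

    `aggregates` is a list of (priority, name) pairs; a lower number
    means a more critical application.
    """
    ordered = [name for _, name in sorted(aggregates)]
    with ThreadPoolExecutor(max_workers=max_tasks) as pool:
        # map() returns results in submission order, not completion order.
        return list(pool.map(recover_aggregate, ordered))


results = recover_site([(2, "BILLING"), (1, "PAYROLL"), (3, "REPORTS")])
print(results)
```

A single-threaded recovery corresponds to `max_tasks=1`, where every application waits behind the one before it.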

Selecting Data To Backup With ABARS

The INCLUDE/EXCLUDE parameters are very familiar. Two newer parameters, ALLOCATE and ACCOMPANY, may be less obvious.

Data sets that must be pre-allocated, empty, and cataloged can be specified in the ALLOCATE list. ABARS copies their attributes, but no actual data is backed up. During a recovery, each such data set is allocated and cataloged using those attributes. Tapes created with other utilities or programs can be specified in the ACCOMPANY list. The actual tape is not mounted and read, just included in the list of tapes in the aggregate.

Data is backed up directly from any level of the hierarchy (migrated data does not need to be recalled before being backed up).
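The effect of the four selection lists can be sketched in Python as follows. This is a simplified model, not the ABARS selection syntax: the `classify` function and the data set names are hypothetical, shell-style wildcards from `fnmatch` stand in for real data set name filters, and EXCLUDE is assumed here to apply only to names matched by INCLUDE.

```python
from fnmatch import fnmatchcase


def classify(data_sets, include, exclude, allocate, accompany):
    """Sort data set names by the action each selection list implies."""

    def matches(name, patterns):
        return any(fnmatchcase(name, pattern) for pattern in patterns)

    plan = {"backup_data": [], "attributes_only": [], "list_tape_only": []}
    for name in data_sets:
        if matches(name, include) and not matches(name, exclude):
            plan["backup_data"].append(name)      # INCLUDE: data is copied
        elif matches(name, allocate):
            plan["attributes_only"].append(name)  # ALLOCATE: attributes only
        elif matches(name, accompany):
            plan["list_tape_only"].append(name)   # ACCOMPANY: tape is listed, not read
    return plan


plan = classify(
    data_sets=["PAY.MASTER", "PAY.MASTER.OLD", "PAY.WORK.EMPTY", "PAY.TAPE.HISTORY"],
    include=["PAY.MASTER*"],
    exclude=["PAY.MASTER.OLD"],
    allocate=["PAY.WORK.*"],
    accompany=["PAY.TAPE.*"],
)
print(plan)
```

In this model `PAY.MASTER.OLD` appears in no output list, because it matches INCLUDE but is removed by EXCLUDE and matches neither of the other lists.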

Output Files Created by ABARS

The four major files created by ABARS are the ‘C’ File (Control File), ‘I’ File (Instruction File), ‘D’ File (Data File), and ‘O’ File (Offline File). The names of these files are determined by a combination of a data set name prefix that you specify, and a dynamically created suffix in a fixed format. With ABARS II the suffix now has a ‘GDG like’ name, indicating the version and copy number. An example of a ‘C’ File name would be SIS.ABAR.C.C01V0001. The ‘C’ File contains information that ABARS requires to perform the application data recovery, and can be specified on the ABARS recovery command.

The ABARS Activity Log

ABARS creates multiple activity logs, separate from other logs for migration, backup, dump, recycle and commands. ABARS logs contain information regarding the success or failure of the aggregate process.

The activity log must be examined to determine the status of the ABARS process. If there were any errors, what are they and how will they be corrected? And finally, information required for the recovery process will need to be verified.

Multiple ABARS Output Copies

For installations that wish to perform the backup once, but create two output tape copies (one to remain on-site, the other vaulted), ABARS II has the ability to create up to 15 tape copies at once.

The copies are tracked via the copy number in the file names. In the name SIS.ABAR.C.C01V0001 mentioned previously, the C01 is tracking the first copy of the aggregate backup. Copy number 2 would be tracked via the name SIS.ABAR.C.C02V0001.
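The naming scheme can be sketched in Python. The field widths (two digits for the copy, four for the version) are inferred from the single example SIS.ABAR.C.C01V0001 and are an assumption, as are the helper names `build_name` and `parse_name`.

```python
import re

# File type letters and descriptions as given in this article.
FILE_TYPES = {"C": "Control", "I": "Instruction", "D": "Data", "O": "Offline"}

# Assumed layout: <prefix>.<type>.C<2-digit copy>V<4-digit version>
NAME_PATTERN = re.compile(
    r"^(?P<prefix>.+)\.(?P<type>[CIDO])\.C(?P<copy>\d{2})V(?P<version>\d{4})$"
)


def build_name(prefix, file_type, copy, version):
    """Compose an output file name from its parts."""
    if file_type not in FILE_TYPES:
        raise ValueError(f"unknown file type: {file_type}")
    return f"{prefix}.{file_type}.C{copy:02d}V{version:04d}"


def parse_name(name):
    """Split an output file name into prefix, file type, copy and version."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not in the assumed ABARS output format: {name}")
    return {
        "prefix": match["prefix"],
        "file_type": FILE_TYPES[match["type"]],
        "copy": int(match["copy"]),
        "version": int(match["version"]),
    }


first_copy = build_name("SIS.ABAR", "C", copy=1, version=1)
second_copy = parse_name("SIS.ABAR.C.C02V0001")
print(first_copy, second_copy)
```

Parsing the copy number back out of the name is how an on-site copy can be told apart from a vaulted copy of the same version.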

Single Data Set Recovery Using ABARS

For ABARS to replace incremental backup, it would need the ability to recover a single data set from the aggregate. Neither ABARS I nor ABARS II has this ability. A third party product easily provides this facility for Primary, Migrated or Tape data.

If replacement of incremental backup with ABARS is undertaken, DFSMShsm should then be set up to bypass the data set during the incremental backup window. This can be accomplished either through a DFSMS MGMTCLAS of no autobackup, or by placing the data set through the ACS Routines into a Storage Group that is not autobacked-up.

Savings With ABARS

Several areas of savings are also possible. Combining ABARS with a third party product allows ABARS to take on additional function, such as single data set restore, thereby replacing incremental backups. This will drastically reduce the number of redundant backups currently being taken, not to mention the amount of CPU, data transfer, tape cartridges, etc., that will be saved. The savings described above would then increase significantly.

Properly implemented, ABARS provides easy, streamlined site recovery, data and catalog synchronization, support for different hardware device geometries and drastically reduced outage times. ABARS application oriented logical recovery and ‘multi-tasking’ provide the ability to start running important, business critical applications quickly in the event of a site outage. ABARS was designed to work well in today’s 24X7, fully SMS environment. ABARS functionality can easily be expanded to also cover local recovery needs, resulting in significant daily savings over current, traditional tools.


Brad Bruhahn is the vice president of Technologies Services for Software Information Services (Mainstar).

Last modified on October 11, 2012