DISASTER
RECOVERY
JOURNAL
DATA RECOVERY
Synchronicity or Not
By JEFF BLACKMON
In today’s data processing
environment, synchronicity of data may need to be addressed when performing
backups as well as the recovery of data. The subject of data synchronicity
was not a big topic of discussion when data centers were composed of
a single large mainframe. The default way of doing backups ensured that
all the data was in sync. But with the transition to the distributed
environment of multiple UNIX and NT servers, synchronicity of data may
be an issue that needs to be considered. This can be the difference
between a successful business recovery and just having a large collection
of miscellaneous data.
In the past, storage administrators usually did not have to worry about
the data synchronicity issues when performing backups in the mainframe
environment. The accepted method for doing data backups was by the full
volume dump method. Usually on a Saturday night or early Sunday morning,
all applications were halted and then multiple full volume disk backups
were submitted. Data was static. This went on for a given number of
hours until the last full volume backup was complete. At no time during
the backups were application programs up and running. This style of
backup scenario produced a full set of recovery tapes that could be
used at a hotsite or elsewhere for data center recovery, and also produced
a recovered system that had all the data in sync. In effect, a snapshot
in time of the entire system had been collected and stored onto tape
or cartridge.
In today’s processing environments, backup windows, requirements,
processes, procedures, and equipment have drastically changed since
the days of stand-alone mainframe environments. Now, large data centers
may have mainframes attached to many UNIX servers, which in turn may
be connected to a multitude of NT Servers. All of the NT Servers may
possibly feed information back into a data warehouse system that resides
on the mainframe. A single transaction coming into a data center probably
will reside on multiple platforms at different points in time. This
type of data center architecture presents a multitude of problems for
backup and recovery processing. Storage management no longer has the
luxury of a large backup window to complete a snapshot of all the data.
Many, if not all, shops are heading toward a 24x7 processing schedule.
Added to all the confusion are the different backup methodologies that
are used today. Mainframes are usually still backed up by the full volume
dump method along with incremental backups occurring at specified intervals.
But the distributed systems are mainly using the incremental forever
method employed by the major storage backup vendors. There are distributed
backup systems on the market today that do have full volume backup capability.
But one of the major concerns with using full volume dumps in the distributed
environment is lack of bandwidth on the networks. (This will probably
be less of an issue in a SAN environment.)
As an example, say that a shop has 100 UNIX and NT servers to back up
between the hours of 8 p.m. and 6 a.m. The distributed backups would
not be scheduled to run the 100 backups concurrently. Therefore, in
this example, the person in charge of the backups would probably schedule
five backups starting on the hour as well as five on the half hour.
This will give a back-up rate of 10 systems per hour and complete the
100 backups by the 6 a.m. deadline. Remember, most of the client servers
will be active during this back-up process. This is where the synchronicity
issue needs to be considered. Now transactions may be coming into the
systems while other servers are being backed up. There is the possibility
that a specific transaction may be backed up on different platforms
depending on back-up schedules. This can result in the transaction being
captured multiple times. There is also the possibility that the transaction
could move through the application systems and miss backups altogether.
As you can see, this type of staggered back-up scenario would present
serious shortfalls, especially to financial institutions or other businesses
that cannot afford to lose or duplicate any transactions. This is the
problem of keeping the data in sync during back-up processing.
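The staggered schedule in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the server names, the calendar date, and the two transaction paths are hypothetical, and each backup is treated as an instantaneous capture at its start time, which real backups are not.

```python
from datetime import datetime, timedelta

def build_schedule(n_servers=100, batch_size=5, interval_minutes=30,
                   start=datetime(2003, 1, 4, 20, 0)):
    """Return {server_name: backup_start_time} for the staggered window.

    Five servers start on the hour and five on the half hour, beginning
    at 8 p.m. The date itself is arbitrary.
    """
    schedule = {}
    for i in range(n_servers):
        batch = i // batch_size
        offset = timedelta(minutes=batch * interval_minutes)
        schedule[f"server{i + 1:03d}"] = start + offset
    return schedule

def copies_captured(path, schedule):
    """Count the backups that contain a given transaction.

    path is a list of (server, arrived, departed) tuples describing where
    the transaction resided over time. A backup captures the transaction
    if it starts while the transaction resides on that server.
    """
    return sum(1 for server, arrived, departed in path
               if arrived <= schedule[server] < departed)

schedule = build_schedule()

# Hypothetical transaction that moves from an early-backup server to a
# late-backup server: both backups see it.
duplicated = copies_captured([
    ("server001", datetime(2003, 1, 4, 19, 55), datetime(2003, 1, 4, 21, 0)),
    ("server100", datetime(2003, 1, 4, 21, 0), datetime(2003, 1, 5, 7, 0)),
], schedule)

# Hypothetical transaction that moves the other way: neither backup sees it.
missed = copies_captured([
    ("server100", datetime(2003, 1, 4, 19, 55), datetime(2003, 1, 4, 21, 0)),
    ("server001", datetime(2003, 1, 4, 21, 0), datetime(2003, 1, 5, 7, 0)),
], schedule)

print(max(schedule.values()))  # start time of the last batch
print(duplicated, missed)
```

Under these assumptions the last batch starts at 5:30 a.m., the first transaction lands in two backups, and the second lands in none.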
This is not just keeping data in sync between mainframe and distributed
systems. There is just as high a probability that the out-of-sync
problems will occur between different distributed systems. This is true
since most organizations perform distributed backups on multiple servers
during a particular back-up window, such as the midnight to 6 a.m. timeframe.
Some distributed systems may have data backed up at 1 a.m., while other
servers have data backed up closer to the 6 a.m. timeframe. In this
type of environment, there is no way to guarantee that all transactions
will be captured somewhere in the backup process.
In 24x7 environments, the storage administrator must have a firm understanding
of the flow of data. Where does the data come in from? What systems
is it passed on to? What are the interdependencies between different
systems? Where is the final resting point of the data transaction? All
of these points need to be taken into consideration when evaluating
a successful plan of backup and recovery of data.
The order of back-up schedules will probably need to be adjusted to
keep data in sync. It may be worthwhile to consider quiescing particular
systems for backups, if possible. This can be a very complicated and
tedious procedure to go through to verify the validity of backups. In
a large data center with hundreds of servers, this may be nearly impossible
to do.
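One way to turn a map of data flow into a back-up order is a topological sort. The sketch below is an assumption on our part, not a procedure the article prescribes: the three system names are hypothetical, the flow is assumed to be one-way, and backing up in the direction of flow is just one plausible ordering rule. Under those assumptions, a transaction that moves downstream between backups may be captured twice, but it is less likely to be missed altogether.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical data flow: each system maps to the systems that feed it.
feeds = {
    "unix_app": {"nt_frontend"},
    "mainframe_warehouse": {"unix_app"},
}

# static_order() yields every system after all of the systems that feed it,
# so entry points are backed up first and the final resting point last.
backup_order = list(TopologicalSorter(feeds).static_order())

for position, system in enumerate(backup_order, start=1):
    print(position, system)
```

With hundreds of servers the hard part is building and maintaining the `feeds` map, which is the tedious verification work described above.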
Another solution to the synchronicity problem can be handled with new
technology. Several new products available now are a great aid in producing
synchronized backups. These products have the ability to help resolve synchronicity
problems in data backups. The standard approach to backups in this type
of environment is to stop all applications for a short period of time,
take a snapshot of the data, and then start the application processes running
again. Downtime of critical applications due to data backups can now
be minutes instead of hours. After the data has been captured in a snapshot,
the backups and the applications can be run concurrently. The data backups
will collect data from the snapshot, not the live data that now is being
updated by the applications.
This process will work for both mainframe and distributed systems. Multiple
systems could be quiesced, snapshots taken of the data, and then the systems
returned to active processing status. This is a way to shorten the backup
window from hours to minutes, and more importantly, produce a backup
that is in sync between multiple systems. All of this will become more
critical within a disaster recovery event. All of the data can be recovered
so that there are no problems with duplicate records as well as the
assurance that every record is included. This type of backup scenario
will produce a good base copy of data to start from. Then incremental
backups from the different systems may be applied as necessary to bring
the data back to a specific point in time.
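The quiesce, snapshot, and resume sequence can be illustrated with a small in-memory model. Everything here is hypothetical: the `System` class, the system names, and the transaction IDs stand in for real servers and storage products, and `copy.deepcopy` stands in for a storage-level snapshot.

```python
import copy
from contextlib import contextmanager

class System:
    """Hypothetical stand-in for one server and its application data."""

    def __init__(self, name):
        self.name = name
        self.records = []
        self.quiesced = False

    def write(self, record):
        if self.quiesced:
            raise RuntimeError(f"{self.name} is quiesced; writes are held")
        self.records.append(record)

@contextmanager
def quiesced(systems):
    """Halt updates on every system, then resume them on exit."""
    for system in systems:
        system.quiesced = True
    try:
        yield
    finally:
        for system in systems:
            system.quiesced = False

def snapshot_all(systems):
    """Copy every system's data while all of them are quiesced."""
    with quiesced(systems):
        return {s.name: copy.deepcopy(s.records) for s in systems}

nt = System("nt_frontend")
mainframe = System("mainframe_warehouse")
mainframe.write("txn-0")
nt.write("txn-1")

snap = snapshot_all([nt, mainframe])

# Applications are running again: txn-1 moves downstream and a new
# transaction arrives while the backup is still being written.
nt.records.remove("txn-1")
mainframe.write("txn-1")
nt.write("txn-2")

# The backup reads from the snapshot, not from the live data.
backup = {name: list(records) for name, records in snap.items()}
print(backup)
```

In this model every transaction that existed at snapshot time appears exactly once in the backup set, even though the live data kept changing while the backup ran.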
In conclusion, the issue of data synchronicity may or may not be a concern
in your particular environment. We are no longer given the easy task
of backing up a single system in a large stand-alone window. We are now
looking at multiple systems being backed up on varying schedules. The
data center environments today present storage administrators many new
challenges working with all the individual entities. Storage administrators
should be aware of the problems that may exist when performing backups
of multiple servers using multiple schedule times. If your environment
is small enough, you may need to arrange the back-up schedules for better
coverage. If that will not work, then it may be necessary to bring up
the issue of moving your data to newer storage servers that provide
the snapshot capability. Is your complete set of backup data in sync
or not?

Jeffrey D. Blackmon, CBCP, is a senior systems consultant for Software
Systems Consulting in San Diego. He has 18 years of experience in the field
of disaster recovery and business continuity for both mainframe and distributed
systems. He can be reached at jeffb@ssccorp.com.
To comment on this article, go to 1603-17
at www.drj.com/feedback.
© Copyright 2003 Systems Support Inc. All rights reserved. Reproduction in whole
or in part in any form or medium without the express written permission
of System Support Inc. is prohibited.