DISASTER RECOVERY JOURNAL
P. O. Box 510110
St. Louis, MO 63151
(314) 894-0276
Fax: (314) 894-7474
Internet: www.drj.com
E-mail: drj@drj.com
DATA RECOVERY
Multi-Terabyte Data Recovery In A Few Clicks
By JEFF IVERSON
Because data is the backbone
of today’s organizations, immediate recovery of data during IT
outages is key to survival. It’s time to shift our focus from
quick backup to near-instant recovery time.
When data is lost or damaged for any reason, it negatively impacts or
completely halts business processes; customer services, income generation,
and the business’ reputation are put at risk, resulting in an
enterprise-wide dilemma. Adequate backups protect against permanent
loss of data; however, the time it takes to rebuild and recover from
a data disaster can have catastrophic business consequences.
The key to avoiding this potential calamity is near-instant data recovery
(from any type or size of data mishap). It is possible to implement
a process that “rewinds” or rolls back the data image to
any precise point in time, giving data administrators the means to recover
from system crashes, data corruption, and any type of data failure in
minutes.
This process is application-neutral and operates with any version of
any application; it works with existing storage and back-up systems,
combining local and/or remote application replication with real-time
data backup to create an extra layer of protection and provide immediate
recovery following any type of data disaster. It addresses the need
for business continuance rather than merely guaranteed backup.

Why Is It Different?
Currently, the most common anti-disaster protection methods employed
include:
• automated backups
• off-site media storage
• data mirroring
• remote data replication
• data snapshots
Automated backups ensure that files are backed up (commonly onto tape)
on a routine basis; a second copy is often stored off-site for
safekeeping. Rebuilding “live” data from tape is cumbersome and
time-consuming, requiring an average of 17-25 hours to restore a
one-terabyte data volume. If the tape must be retrieved from an
off-line, off-site storage facility, recovery may be measured in days.
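To put the tape figure in perspective, the short calculation below scales it to larger volumes. It is only a back-of-the-envelope sketch: it assumes restore time grows linearly with volume size and that one terabyte is 10**12 bytes, neither of which is stated above, and the function names are illustrative.

```python
# Assumption: restore time scales linearly with volume size, using the
# article's figure of 17-25 hours per terabyte restored from tape.
HOURS_PER_TB_LOW = 17
HOURS_PER_TB_HIGH = 25


def tape_restore_hours(volume_tb):
    """Return the (best, worst) estimated tape restore time in hours."""
    return volume_tb * HOURS_PER_TB_LOW, volume_tb * HOURS_PER_TB_HIGH


def implied_throughput_mb_per_s(hours_per_tb):
    """Sustained rate implied by restoring 1 TB (10**12 bytes) in that time."""
    return 1e12 / (hours_per_tb * 3600) / 1e6


best, worst = tape_restore_hours(5)
print(f"5 TB from tape: {best}-{worst} hours")
print(f"Implied rate: {implied_throughput_mb_per_s(HOURS_PER_TB_HIGH):.1f}-"
      f"{implied_throughput_mb_per_s(HOURS_PER_TB_LOW):.1f} MB/s")
```

Under these assumptions, a five-terabyte volume would take several days to rebuild from tape even before any off-site retrieval delay.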
Data mirroring creates a duplicate (secondary) on-line copy that
replaces the primary data in an emergency. Remote data replication
simultaneously duplicates data to a secondary system, ensuring continuous
access should a primary system fail. Data mirroring and remote replication
eliminate the wait time associated with loading and restoring back-up
tapes following a data disaster, as the replicated data can be
substituted quickly. However, they share the same inherent limitation
– both techniques may replicate corrupted data as it enters the
system, leaving only a damaged copy to recover from. Snapshots address
this limitation by adding a point-in-time data image to existing back-up
and recovery procedures, which reduces the chance of reintroducing
corrupted data during the recovery process. However, there is still the
risk of losing data written between point-in-time images, and the
snapshot process itself impacts the system and application.
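The trade-off between mirroring and snapshots can be illustrated with a toy model. The sketch below is hypothetical: a "volume" is just a Python dictionary of block contents, and the names `primary`, `mirror`, `snapshot`, and `write` are invented for illustration rather than taken from any product.

```python
# Hypothetical toy model: a "volume" is a dict of block number -> contents.
primary = {0: "orders-v1", 1: "customers-v1"}
mirror = dict(primary)      # data mirror: kept identical to the primary
snapshot = dict(primary)    # point-in-time image taken at this moment


def write(block, data):
    """Apply a write to the primary; the mirror copies it immediately."""
    primary[block] = data
    mirror[block] = data


write(0, "orders-v2")       # good write made after the snapshot
write(1, "#corrupt#")       # bad write: the mirror replicates it too

# The mirror is no help here, because it holds the same corruption.
mirror_is_corrupt = mirror[1] == "#corrupt#"

# Restoring the snapshot removes the corruption, but the good write to
# block 0 made after the snapshot is lost along with it.
restored = dict(snapshot)
lost_blocks = [b for b in primary
               if primary[b] != restored[b] and primary[b] != "#corrupt#"]
print(mirror_is_corrupt, restored, lost_blocks)
```

In this model the mirror faithfully copies the bad write, while the snapshot restores clean data at the cost of the valid update made after the image was taken.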
Continuously capturing data makes possible a rollback to any point in
time, eliminating the limitations of snapshots. This approach facilitates
the recovery of vast amounts of data by backing out data corruption,
rather than rebuilding from archives and snapshots. Applications can
be restored accurately and verifiably with unprecedented speed.
With the ability to back out the corrupted data rather than rebuild
the entire data image, full data restores following a major data disaster
occur in minutes versus hours. For example, 1 TB or more of data can
be fully restored in less than 20 minutes.

How Rollback Works
A history journal, always queued for immediate recovery, keeps a
constant running record of application activity, making it possible to
“time slide” forward or backward to any point in time and recover data
in its uncorrupted state. Writes are continuously intercepted and tagged
as they occur in real time without altering the actual data or impacting
the application. A second process (typically on a remote system)
maintains an active replica and journals a history of data activity.
The recording and journaling of data has no impact on the running
applications or server operations, requires very little storage space,
and facilitates the ability to scroll the data to any point in time.
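The journaling idea can be sketched in a few lines. What follows is a minimal, in-process illustration and not the product's implementation: the class `JournaledVolume` and its methods are hypothetical, timestamps are assumed to arrive in non-decreasing order, and only the backward "time slide" is shown.

```python
class JournaledVolume:
    """Toy block device that journals every write for point-in-time rollback."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block number -> contents
        self.journal = []            # (timestamp, block, old, new), time order

    def write(self, timestamp, block, data):
        """Intercept a write: record the before-image, then apply it."""
        self.journal.append((timestamp, block, self.blocks.get(block), data))
        self.blocks[block] = data

    def rollback_to(self, timestamp):
        """Back out, newest first, every write made after `timestamp`."""
        while self.journal and self.journal[-1][0] > timestamp:
            _, block, old, _ = self.journal.pop()
            if old is None:
                self.blocks.pop(block, None)   # block did not exist before
            else:
                self.blocks[block] = old


vol = JournaledVolume({0: "a"})
vol.write(100, 0, "b")
vol.write(200, 1, "c")
vol.write(300, 0, "#corrupt#")   # the bad write to be backed out
vol.rollback_to(200)
print(vol.blocks)
```

Because only the writes made after the restore point are undone, the work is proportional to the amount of recent change rather than to the size of the volume.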
Because it works at the block level, this process does not depend on
specific applications or storage solutions; it is compatible with, and
provides near-real-time recovery for, any application, relational
database, and file system, and it is not constrained by protocols or
formats. Because everything is tracked on disk, a rollback to any point
in time is possible, enabling the user to back out the corruption rather
than rebuild the data structure. Remember that this process does not
replace your existing data archive solution. It adds a layer of
protection, providing almost immediate data recovery following any type
of data disaster.
When a data restore is necessary, a rollback through the history journal
can establish a precise restore point. Once a valid restore point is
identified, the rollback sequence is committed to the production
application data. Recovery methodologies include full restoration,
partial restoration, and disaster recovery procedures.
Full recovery is accomplished by rolling back a copy on the back-up
server to validate the restore point. The same rollback sequence is then
applied to the production application to resynchronize the active data.
Optionally, the rollback can be applied directly to the production
application, bypassing the validation on the back-up server.
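The validate-then-apply sequence of a full recovery might look like the sketch below. It assumes a journal of (timestamp, block, old value, new value) records in time order; the functions `roll_back` and `full_recovery` and the `is_valid` check are hypothetical names chosen for illustration.

```python
def roll_back(blocks, journal, restore_point):
    """Return a copy of `blocks` with every write after `restore_point` undone.

    `journal` is a time-ordered list of (timestamp, block, old, new) tuples.
    """
    result = dict(blocks)
    for timestamp, block, old, _new in reversed(journal):
        if timestamp <= restore_point:
            break
        if old is None:
            result.pop(block, None)   # block did not exist before this write
        else:
            result[block] = old
    return result


def full_recovery(production, backup_copy, journal, restore_point, is_valid):
    """Validate the restore point on the back-up copy, then fix production."""
    candidate = roll_back(backup_copy, journal, restore_point)
    if not is_valid(candidate):
        raise ValueError("restore point failed validation")
    # Same rollback sequence, now applied to the production data.
    return roll_back(production, journal, restore_point)


journal = [(100, 0, "a", "b"), (300, 0, "b", "#corrupt#")]
production = {0: "#corrupt#"}
backup_copy = dict(production)


def clean(blocks):
    """Illustrative validity check: no block holds the corruption marker."""
    return "#corrupt#" not in blocks.values()


recovered = full_recovery(production, backup_copy, journal, 200, clean)
print(recovered)
```

Skipping the `is_valid` step corresponds to the optional direct rollback on production, which is faster but gives up the chance to reject a bad restore point first.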
Partial recovery is accomplished in the same manner to determine the
restore point. Affected tables and records are extracted and then inserted
into the production application. A tremendous benefit of this partial
recovery process is that the running database remains operational for
applications not accessing these specific records.
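A partial recovery can be pictured as copying only the affected records out of the rolled-back copy. The sketch below is a simplified assumption of how that might work: records are dictionary entries, `rolled_back_copy` stands for the back-up copy already rewound to the restore point, and the function name is hypothetical.

```python
def partial_recovery(production, rolled_back_copy, affected_keys):
    """Restore only `affected_keys`, taking values from the rolled-back copy."""
    restored = dict(production)
    for key in affected_keys:
        if key in rolled_back_copy:
            restored[key] = rolled_back_copy[key]   # insert extracted record
        else:
            restored.pop(key, None)   # record did not exist at restore point
    return restored


# Back-up copy after being rewound to the chosen restore point.
rolled_back_copy = {"acct-1": "balance=50"}
# Production holds one corrupted record and one valid, more recent record.
production = {"acct-1": "#corrupt#", "acct-2": "balance=10"}

result = partial_recovery(production, rolled_back_copy, {"acct-1"})
print(result)
```

Records outside `affected_keys` are never touched, which is why applications that do not use them can keep running during the repair.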
Disaster recovery is accomplished by pointing to the virtual copy of
the application’s data on the back-up server. The application’s
image is “rewound” on disk to any “live” transactional processing
point in time.
To date, this rollback recovery process has been successfully tested
with the Solaris (Sun) and AIX (IBM) operating systems, coupled with
existing file back-up solutions, providing near-zero restoration time.
It runs across two servers – the application (or production) server
and a back-up server; it co-resides on the server holding the storage
management software. Requirements for implementation are a 300MHz+ processor
with a minimum of 512MB of RAM and 2 percent extra disk space on the
client side, and a 300MHz+ processor with 1GB of RAM and disk space
equal to 120 percent of the client’s on the server side.
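The disk requirements translate into a simple sizing calculation. The sketch below assumes that both percentages are measured against the size of the protected client data, which is one reading of the figures above; the function name and the 1,000GB example are illustrative only.

```python
# Percentages from the stated requirements; the base they apply to
# (the protected client data volume) is an assumption.
CLIENT_OVERHEAD_PERCENT = 2
SERVER_DISK_PERCENT = 120


def disk_requirements_gb(client_data_gb):
    """Return (extra client disk, back-up server disk) in gigabytes."""
    client_extra = client_data_gb * CLIENT_OVERHEAD_PERCENT / 100
    server_disk = client_data_gb * SERVER_DISK_PERCENT / 100
    return client_extra, server_disk


client_extra, server_disk = disk_requirements_gb(1000)
print(f"Client overhead: {client_extra:.0f} GB, "
      f"back-up server disk: {server_disk:.0f} GB")
```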
This non-intrusive continuous data capture and immediate recovery process
is raising the bar for data recovery performance and business continuity
standards. Administrators should frequently revisit their enterprises’
in-place business continuance plan to ensure that it is taking into
account newly introduced disruptive advances and enhancements in recovery
technology. One such important transformation is occurring now.
Jeff Iverson is vice president of strategic and technical alliances
for Vyant Technologies, Inc. (www.vyanttech.com), a Fairfax, Va.,
software development company. Iverson has 22 years of experience
in storage management, systems integration and application development.
To comment on this article, go to 1603-18
at www.drj.com/feedback.
© Copyright 2003 Systems Support Inc. All rights reserved. Reproduction
in whole or in part in any form or medium without the express written
permission of Systems Support Inc. is prohibited.