DISASTER RECOVERY JOURNAL
FICON and Mainframe Disaster Recovery Insourcing
By STEVE GUENDERT & RICK BOYD
Several events in the past five years have dictated a
shift in thinking about how companies recover from a disaster. While
most large organizations have had a second datacenter for recovery, either
because it made business sense or for ease of restoration, smaller companies
relied on the tried-and-true hot site. A hot site is a shared service site
that accommodates multiple organizations for disaster recovery operations.
As many companies saw during the blackouts, hurricanes, and terrorist
attacks, the hot site was not the disaster recovery life insurance policy
that was promised. The methodology works fine for a contained event
that does not affect a large geographic area. However, when there is
a widespread event, such as 9/11, the hot site is quickly overwhelmed
by multiple companies simultaneously declaring a disaster, and the
provider simply cannot accommodate everyone in the facility they have
been accustomed to using.
Couple the triage fashion in which a shared-site strategy handles
simultaneous disaster declarations with regulatory concerns, and we see
a trend toward bringing disaster recovery in-house. This phenomenon is
referred to as DR insourcing. FICON technology is an enabler for insourcing
DR much more cost effectively than would be possible with ESCON. In
addition, FICON’s performance advantages over ESCON make it the
technology of choice for meeting recovery point objectives (RPO) and
recovery time objectives (RTO).
DR Insourcing
This approach, bringing disaster recovery back in-house, addresses many
of the complaints that surround the traditional hot site recovery scenario:
- Money – We spend a decent amount of money on a hot site and
do not see any sustained benefit.
- Success – Let’s face it, most tape-based recoveries
performed at a hot site fail. There will be a signature on paper declaring
the test a success, but in most cases files were missing or applications
could not run successfully. And that is in a controlled test where great
care was taken to checkpoint all the data. What would happen in a real
disaster situation?
- Use – Shouldn’t there be a way to get use out of disaster
recovery money instead of it just being an insurance policy you hope
you never need?
- Guarantee – Even though we are paying regularly for the right
to use a hot site, there is no guarantee we will be able to recover
where we normally test.
- Prohibitive Cost – A hidden cost in all hot site contracts
is the “declaration fee.” This is a fee charged when the
client declares there has been a disaster and wants to use the
hot site facilities. Many times, this fee precludes an organization from
declaring a disaster for a single application or a small group of
applications.
For these reasons, and others, many companies are now looking to insource
DR. The methodology is fairly straightforward. Either use existing
facilities or take advantage of the abundant datacenter floor space on
the market to deploy a disaster recovery solution that is owned and
managed internally, while using today’s technology to get use out
of the equipment during non-disaster times. Finally, weigh
the cost-benefit trade-offs and evaluate whether to build a new
data center geared toward insourcing DR.
FICON and DR Insourcing
The greater bandwidth and distance capabilities FICON has over ESCON
are starting to make it an essential and cost-effective component in
high availability, disaster recovery, and business continuity (HA/DR/BC)
solutions. As mentioned earlier, since Sept. 11, 2001, more
and more companies are insourcing DR. Those that do so are building
the mainframe piece of their new DR/BC datacenters using FICON rather
than ESCON, and increasingly this includes cascaded FICON.
Cascaded FICON refers to an implementation of FICON in which one
or more FICON channel paths are defined over two FICON directors that
are connected to each other using an Inter-Switch Link (ISL). The processor
interface is connected to one director, while the storage interface
is connected to the other. This configuration is supported for both
disk and tape, with multiple processors, disk subsystems, and tape subsystems
sharing the ISLs between the directors.
Before FICON cascading, the FICON architecture was limited to a
single domain because of the single-byte addressing limitation inherited
from ESCON. FICON cascading allows the end user a greater maximum
distance between sites: up to an unrepeated distance of 36 km at 2 Gb/sec
bandwidth.
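To put the distance figure in perspective, here is a rough sketch of the propagation delay that fiber distance alone adds to each round trip between sites. It assumes light travels through optical fiber at roughly 5 microseconds per kilometer, which is a common rule of thumb and not a figure from this article, and it ignores director, protocol, and device latency. The function name and constant are illustrative only.

```python
# Rule-of-thumb assumption (not a FICON specification value):
# light in optical fiber takes about 5 microseconds to travel 1 km.
FIBER_DELAY_US_PER_KM = 5.0


def round_trip_delay_us(distance_km: float, round_trips: int = 1) -> float:
    """Propagation delay in microseconds added by fiber distance alone."""
    if distance_km < 0 or round_trips < 1:
        raise ValueError("distance must be >= 0 and round_trips >= 1")
    # Each round trip crosses the link twice (out and back).
    return 2 * distance_km * FIBER_DELAY_US_PER_KM * round_trips


if __name__ == "__main__":
    for km in (10, 36, 100):
        print(f"{km:>4} km: {round_trip_delay_us(km):.0f} us per round trip")
```

Under that assumption, a 36 km link adds about 360 microseconds per round trip. An I/O that needs several protocol exchanges pays that penalty several times over, which is one reason distance and recovery objectives have to be planned together.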
Sept. 11, 2001, underscored how critical it is for an enterprise to
be prepared for disaster, and even more so for large enterprise mainframe
customers. A complete paradigm shift has occurred in how we discuss
DR/BC since 9/11. Disaster recovery is no longer limited to problems such
as a fire or a small flood. Companies now need to consider and plan for
the possibility of the destruction of their entire data center, and
possibly the loss of the people who work in it. A great many articles,
books, and other publications have discussed the IT “lessons learned”
from Sept. 11, 2001:
1) To maintain business continuity, it is absolutely critical to maintain
geographical separation of facilities and resources. Any resource your
enterprise has that cannot be replaced from external sources within
your recovery time objective (RTO) should be available within the enterprise.
It is also preferable to have these resources in multiple locations.
We’re talking about buildings, hardware, software, data, and staff.
Cascaded FICON provides the geographical separation that post-9/11 business
requires; ESCON does not.
2) The most successful DR/BC implementations are often based on
as much automation as possible. Sept. 11 proved that key staff and skills
may no longer be available after a disaster strikes.
3) Financial, government, military, and other enterprises now have critical
RTOs measured in seconds or minutes rather than hours or days. For these
end users it has become increasingly necessary to implement an in-house
(insourced) DR solution. Cascaded FICON allows for considerable cost
savings compared with ESCON when insourcing DR/BC/HA.
Issues
Some items need to be addressed when attempting DR insourcing.
First and foremost are licensing concerns. Particularly on a mainframe,
the costs can be prohibitive. However, creative licensing deals are
increasingly available in which emergency-use licenses remain low cost until
they are enabled at the time of a disaster. On a mainframe, the test/dev
partitions can be moved to the DR site, and a DR partition can sit idle
until testing or an actual event.
A second item that must be addressed is the proximity to the main production
site. How far away should the recovery site be? What methods are there to
get data to it? How much cost savings and performance improvement
will FICON provide over ESCON as the protocol for extension?
Benefits
While it looks on the surface as if the costs of bringing disaster recovery
in-house are prohibitive, think of the amount of money being spent on
a hot site every year. A company I have worked with recently
confided that it spent about $36,000 a month on its hot site contract.
That equates to more than $400,000 a year just for insurance.
We haven’t even discussed the cost associated with sending tape
off site. If that can be cut down or eliminated, we are talking about
a significant amount of money that can be used to fund a new DR strategy.
Even though the initial cost to mirror data and install additional
equipment at the insourced DR site could easily be four to five times
the yearly hot site cost, there are tangible benefits that cannot be
ignored:
- Control – You own the site and the equipment, and you decide
how and when to use it.
- Similar Costs – Admittedly, the initial outlay will cause some
sticker shock, but once the site is deployed, the monthly cost savings
will allow for a break-even point between two and three years. From
that point on, technology refreshes will be on par with the monthly
hot site costs.
- Increased Testing – No longer will you have to wait to test
your disaster recovery plan and spend extra money to fly, feed, and
house your employees. Since the site is always connected to your primary
facility, a disaster test can be more spontaneous and closer to “real
world” than the “staged” tests at a hot site.
- Ability to reduce tape costs – Tape is increasingly being
replaced by low-cost disk (virtual tape), a solution that allows
data to be mirrored between sites. Additionally, tape costs
can be reduced by limiting tape to the role it was meant for …
archival.
- Ability to build data center at lower cost with FICON – The
performance gains of FICON over ESCON have allowed FICON adopters to
significantly consolidate not only their channel environment but also
their disk and tape storage onto fewer footprints while getting
performance improvements. FICON DASD will typically
yield 40 percent or better improvements in subsystem response times,
while allowing the end user to consolidate “X” TB onto fewer
DASD array frames.
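The cost argument above can be checked with a short break-even calculation. This is a minimal sketch, not the authors' costing model: the $36,000 monthly fee and the "four to five times the yearly cost" multiplier come from the article, while `OTHER_MONTHLY_SAVINGS` (avoided tape shipping, test travel, and similar items) is a purely hypothetical number. It also ignores the ongoing running costs of the insourced site.

```python
import math


def break_even_months(initial_outlay: float, monthly_savings: float) -> int:
    """Whole months until cumulative savings cover the initial outlay."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return math.ceil(initial_outlay / monthly_savings)


# Figure from the article: a $36,000-per-month hot site contract.
HOT_SITE_MONTHLY = 36_000
HOT_SITE_YEARLY = HOT_SITE_MONTHLY * 12  # 432,000

# The article says four to five times the yearly cost; 4x is used here.
INITIAL_OUTLAY = 4 * HOT_SITE_YEARLY  # 1,728,000

# Hypothetical placeholder for other avoided monthly costs.
OTHER_MONTHLY_SAVINGS = 24_000

if __name__ == "__main__":
    fee_only = break_even_months(INITIAL_OUTLAY, HOT_SITE_MONTHLY)
    with_other = break_even_months(
        INITIAL_OUTLAY, HOT_SITE_MONTHLY + OTHER_MONTHLY_SAVINGS
    )
    print(f"Hot site fee only: {fee_only} months")
    print(f"With other savings: {with_other} months")
```

With these inputs, the hot site fee alone recovers a 4x outlay in 48 months. Landing in the two-to-three-year range depends on counting additional savings such as tape handling and test travel, so those items deserve a place in the analysis.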
Getting Started
Just as initially deploying a disaster recovery strategy
can be daunting, so too can the process of DR insourcing. First things
first: if you have not already done so, categorize applications into
tiers of recovery. In today’s world, recovering everything at
the same time is not feasible.
After the applications are categorized, both a technology strategy and
a secondary site need to be developed. Paramount in this effort is
understanding the RTO and RPO for each tier in order to match the proper
technology to each tier. The RTO and RPO will dictate both the technology
needed to replicate data and how far away the recovery
site can be. The lower the RPO, the closer the site has to be in order
to ensure the replicated data is current to the point of interruption.
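The tiering step can be sketched as a small data structure. Everything here is an illustrative assumption: the tier names, the RTO and RPO values, the 60-minute threshold, and the split into synchronous mirroring, asynchronous replication, and remote tape are placeholders, not recommendations from the authors. Real values would come out of a business impact analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryTier:
    name: str
    rto_minutes: float  # recovery time objective
    rpo_minutes: float  # recovery point objective


def suggest_replication(tier: RecoveryTier) -> str:
    """Map a tier's RPO to a replication approach (hypothetical thresholds)."""
    if tier.rpo_minutes <= 0:
        # No tolerated data loss: every write must reach both sites,
        # which limits how far apart the sites can be.
        return "synchronous disk mirroring (distance-limited)"
    if tier.rpo_minutes <= 60:
        return "asynchronous disk replication"
    return "remote or virtual tape"


# Hypothetical tiers for illustration only.
TIERS = [
    RecoveryTier("tier 1", rto_minutes=15, rpo_minutes=0),
    RecoveryTier("tier 2", rto_minutes=240, rpo_minutes=30),
    RecoveryTier("tier 3", rto_minutes=2880, rpo_minutes=1440),
]

if __name__ == "__main__":
    # List tiers from the most to the least urgent recovery time.
    for tier in sorted(TIERS, key=lambda t: t.rto_minutes):
        print(f"{tier.name}: {suggest_replication(tier)}")
```

Keeping the objectives in one table like this makes it easier to see which tiers actually drive the distance and technology decisions, and which can stay on cheaper methods.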
Don’t try to boil the ocean. My recommendation is to take sections
of the disaster recovery strategy and implement them in pieces over time.
For instance, consider putting a remote tape system (and a recovery system,
of course) at the recovery site without changing the RTO/RPO initially.
This will allow the organization to implement the DR insourcing strategy
without incurring data replication costs beyond those of
the remote tape solution. The second phase can introduce top-tier data
replication, and subsequent phases can enable enhanced recovery for all
other tiers over time. By deploying in phases, an organization can spread
out the costs while implementing a DR insourcing strategy and reducing
the RTO and RPO.
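A phased rollout can be tracked as an ordered list of phases with a running cost total. In this sketch the phase names follow the sequence described above, but every dollar amount is a made-up placeholder, and the `Phase` class and `cumulative_costs` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    cost: float  # hypothetical, in dollars


# Phase names follow the article; all costs are invented placeholders.
PHASES = [
    Phase("remote tape and recovery system", 600_000),
    Phase("top-tier data replication", 700_000),
    Phase("enhanced recovery for remaining tiers", 400_000),
]


def cumulative_costs(phases):
    """Return (phase name, running total) pairs in rollout order."""
    total = 0.0
    running = []
    for phase in phases:
        total += phase.cost
        running.append((phase.name, total))
    return running


if __name__ == "__main__":
    for name, total in cumulative_costs(PHASES):
        print(f"{name}: ${total:,.0f} spent to date")
```

Laying the phases out this way shows how much has been committed at each step, which helps when the spend has to be spread over more than one budget cycle.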
Conclusion
Mainframe DR insourcing may not be right for your organization. However,
there are benefits that demand serious consideration. If, for instance,
you can keep your run rate at a similar level, or increase it only slightly,
and get use out of your disaster recovery equipment, then the cost/benefit
analysis will look that much better. In fact, DR insourcing changes
the disaster recovery model from an insurance policy to
a dual-use asset.
Bear in mind that if an organization decides this is a path it wants to
investigate, the analysis alone could take months. Beyond that, there
are political issues to contend with when trying to change the current
disaster recovery objectives. Some business units will not want to hear
that their application doesn’t merit a place in the top tier
of recovery. With due diligence and a solid costing strategy, the conversation
is much easier to have. The business unit can argue that its application
belongs in a higher tier, and it can get into that tier
if it is willing to pay the higher costs associated with that recovery
objective. Many times the business unit, when presented with a bill that
affects its bottom line, will acquiesce.
Steve Guendert is McDATA’s FICON principal consultant and
is regarded as one of the mainframe industry’s FICON experts.
You may reach him at stephen.guendert@mcdata.com.
Rick Boyd is McDATA’s technology recovery principal consultant
and is a trusted advisor for planning both mainframe and open systems
BC/DR throughout the financial industry. You can e-mail him at rick.boyd@mcdata.com.
©Copyright Systems Support Inc. All rights reserved. Reproduction in whole or in
part in any form or medium without the express written permission of
Systems Support Inc. is prohibited.