DISASTER RECOVERY JOURNAL
Four Common Mistakes to Avoid When Moving Servers
By Amichai Lesser
Server relocations and data center consolidation
can deliver significant benefits – including enhanced business
continuity, optimized disaster recovery schemes, cost savings, better
service management, and improved regulatory compliance.
But the risk associated with moving servers further away from your
end-users should not be ignored. That’s because users who were
local to servers become remote users, and the interim stages in a data
center relocation may introduce distance between back-end servers. These
physical displacements can negatively impact application performance
and result in significant business interruption.
In fact, when IT organizations plan server moves, they often focus
exclusively on systems issues such as the right-sizing of new servers
or virtualization of storage resources. As important as those issues
are, it’s a big mistake to ignore the impact of adding distance
across the network. If you don’t adequately understand and address
the issues that arise when you put more physical distance between users
and servers – or between servers and servers – you can
set yourself up for serious pain and potential failure.
Here are four common mistakes you should be particularly careful to
avoid:
1) Confusing network latency with application latency
When you move servers further away from users, you introduce network
latency. That is, the physical distance between users and servers causes
a delay in the signal between the two. But adding 50 milliseconds of
network delay doesn’t mean that your application response times
will only increase by 50 milliseconds. On the contrary, most applications
require many back-and-forth interactions between user and server (often
referred to as application “turns”) to perform even the
most basic tasks. Thus, the addition of just 50 milliseconds of network
delay can cause an action that took only three seconds to complete
locally to take a full 30 seconds after a server move.
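The arithmetic behind that example can be sketched in a few lines of
Python. This is a simplified model, not a measurement: it assumes that
turns happen strictly one after another and that each turn pays the
added network delay exactly once. The function names are illustrative,
and the turn count is not given in the text; it is derived here from
the three-second and 30-second figures.

```python
def post_move_response_time(local_seconds, turns, added_delay_ms):
    """Estimate response time after a move under a sequential-turn model.

    Assumption: each application turn pays the added network delay once.
    """
    return local_seconds + turns * (added_delay_ms / 1000.0)


def implied_turns(local_seconds, remote_seconds, added_delay_ms):
    """Work backwards: how many turns would explain the observed slowdown?"""
    return round((remote_seconds - local_seconds) * 1000.0 / added_delay_ms)


# The 3 s -> 30 s example with 50 ms of added delay per turn.
turns = implied_turns(3.0, 30.0, 50)
estimate = post_move_response_time(3.0, turns, 50)
print(turns, estimate)
```

Under these assumptions the slowdown corresponds to several hundred
turns, which is why the turn count, rather than the raw delay, dominates
the result.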
Unfortunately, this network-related latency is usually regarded as
the network manager’s problem, even though the application (including
the number of “turns” it requires) may be the real problem.
After all, the network manager can’t change the speed of light,
or make Tokyo closer to New York. So it doesn’t make sense to
lay the problem entirely on him or her. In fact, because application
design issues are often responsible for poor response times after a
server move, additional investments in the network will often do
little to help.
2) Failing to realize how network latency impacts server performance and scalability
Many IT organizations don’t fully grasp how the addition of network
latency degrades – often substantially – the scalability
and performance of application servers. This often-overlooked phenomenon
has an adverse impact on the entire user population – not just
remote users. It is almost never caught in the QA process, and is rarely
diagnosed correctly even when it causes problems in the production
environment.
How does network latency affect server performance? The answer is simple.
A server allocates resources to each concurrent client session. Local
clients complete these sessions quickly because their application turns
are subject to minimal network-related delay. Remote sessions, on the
other hand, take much longer to complete because each application turn
takes so much longer.
It is important to note that servers lock up resources for the duration
of the process, and only free them when the process is completed. Thus,
when remote users communicate with a server, they keep its resources
busy for a longer period of time. This prevents the server from releasing
those resources for use by other clients – severely limiting
its performance and ability to scale.
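One way to put numbers on this effect is Little's law (average sessions
in progress = arrival rate x average session duration). This is a
standard queueing result that is brought in here as an illustration; it
is not named in the discussion above. The sketch assumes a steady
arrival rate and a simple model in which session time is server work
plus network wait across all turns. Every figure is hypothetical.

```python
def session_seconds(server_seconds, turns, delay_ms_per_turn):
    """Session duration = server work + network wait over all turns (assumed model)."""
    return server_seconds + turns * delay_ms_per_turn / 1000.0


def concurrent_sessions(arrivals_per_second, seconds_per_session):
    """Little's law: average number of sessions holding server resources."""
    return arrivals_per_second * seconds_per_session


rate = 20  # hypothetical: 20 new sessions per second
local = concurrent_sessions(rate, session_seconds(0.5, 40, 1))    # ~1 ms LAN delay
remote = concurrent_sessions(rate, session_seconds(0.5, 40, 50))  # 50 ms WAN delay
print(local, remote)
```

With the same load and the same server work per session, the remote
case keeps several times as many sessions open at once, so a thread or
connection pool sized for local users fills up much sooner.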
Unfortunately, conventional testing and QA typically focus on back-end
scalability, with little or no attention given to real-world network
latencies. That is why IT organizations are so often surprised when
server performance degrades after a data center move.
3) Ignoring business continuity best practices during interim stages of server relocation
Ideally, an enterprise could pack all of its servers in one weekend,
load them on moving trucks, unpack in the new location, and be up and
running by Monday. The reality is quite different. Enterprise data
centers can consist of dozens or hundreds of servers. It can take weeks
or months and multiple relocation steps to complete a move to a new
location. Thus, during interim stages, some servers will operate from
their original locations while others will operate from the new location.
The introduction of this physical distance between servers can seriously
impact both business continuity and application performance.
Most business continuity schemes depend on a contingency site, which
is provisioned with replicated enterprise data. In most cases, all
the data to be replicated comes from a single location: the data center.
But, during a data center move, some data sources will reside in the
old data center and some will have already moved to the new location.
This distribution of data sources complicates disaster recovery and
introduces new vulnerabilities to the IT environment.
The physical separation of servers can also have a dramatic and unexpected
impact on application performance, because computing processes are
almost never designed to accommodate significant inter-server latency.
Any IT organization planning a data center move must therefore ask
a variety of questions. Did I adjust my disaster recovery plan to cover
interim relocation steps? What happens when servers with critical inter-dependencies
are temporarily separated? Which servers must be moved with other servers?
When should Active Directory servers be moved? Which servers will need
to be replicated for the duration of the move?
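The question of which servers must move together can be approached as
a grouping problem. The sketch below is one possible approach, not a
prescribed method: it treats each critical inter-dependency as a link
and collects servers that are connected, directly or indirectly, into
the same candidate move group. The server names and the dependency list
are hypothetical.

```python
from collections import defaultdict


def move_groups(dependencies):
    """Group servers linked, directly or indirectly, by a dependency.

    `dependencies` is a list of (server_a, server_b) pairs. Servers in the
    same group are candidates for moving in the same relocation step.
    """
    neighbours = defaultdict(set)
    for a, b in dependencies:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk collects everything reachable from `start`.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical server names, for illustration only.
deps = [("web01", "app01"), ("app01", "db01"), ("mail01", "dir01")]
groups = move_groups(deps)
print(groups)
```

In practice only latency-sensitive dependencies would be listed, since
linking every server that ever talks to another tends to produce one
large group that cannot be moved in a single step.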
4) Not dealing with users’ performance expectations until after the move
Sometimes, it simply doesn’t make sense to set a post-relocation
service level objective (SLO) that is identical to what had previously
been a local response time. If it took a local user three seconds to
execute a task before a server move, it is very unlikely that the task
will take the same amount of time after that server is moved across
the country. An SLO of seven seconds, for example, may be more reasonable.
That’s why it is critical to directly address users’ service
level expectations up front. If you wait until after the move and tell
users they just have to live with what you can deliver, you’re
setting yourself up for a battle. But if you can get buy-in beforehand
as part of the planning process, you can avoid such hassles and ensure
that no one has unrealistic expectations.
To achieve this pre-deployment acceptance, two elements are needed.
First, IT must have a way of predicting what post-move performance
will look like. Second, users must be given a way to experience post-move
performance in advance. That is, IT must be able to simulate post-move
performance. These predictive and simulation capabilities enable IT
to set up “acceptance environments” where users can experience
post-move performance first-hand before the move is actually executed.
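As a rough illustration of the simulation idea, the sketch below adds
an artificial delay to every application turn. It is an
application-level simplification made for this example; a real
acceptance environment would more likely emulate the network itself.
The names with_added_latency and fetch_record are hypothetical, and the
sleep function is injectable so the example runs without real waiting.

```python
import time


def with_added_latency(turn, added_delay_ms, sleep=time.sleep):
    """Wrap one application turn so each call waits an extra network delay.

    `turn` is any callable standing in for a single request/response exchange.
    `sleep` is injectable so the wrapper can be exercised without real waiting.
    """
    def delayed(*args, **kwargs):
        sleep(added_delay_ms / 1000.0)  # simulated extra round-trip time
        return turn(*args, **kwargs)
    return delayed


# Hypothetical stand-in for a real request to the application.
def fetch_record(record_id):
    return {"id": record_id}


waits = []  # record the delays instead of actually sleeping
remote_fetch = with_added_latency(fetch_record, 50, sleep=waits.append)
results = [remote_fetch(i) for i in range(3)]
print(results, sum(waits))
```

Leaving the sleep argument at its default makes each call really pause,
which is the behavior a user would need to experience in order to judge
whether the post-move response time is acceptable.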

Seven Steps for Project Success
To avoid making these mistakes, IT organizations must have full visibility
into the subtle, complex interactions between applications, networks
and infrastructure. Unfortunately, responsibility for these three
areas has been split into separate operational “silos.” A
siloed approach, however, reduces the likelihood that IT will successfully
predict and address the performance problems that can result from
a data center move. It is therefore essential to take a new collaborative
approach that effectively blends the expertise of the application
team, systems managers and network architects. These collaboration
best practices are outlined in the following seven-step plan:
1) Build a virtual model of the pre- and post-relocation enterprise
environment, as well as all planned transitional phases. All participants
in the planning process, including business users, need concrete information
about how network infrastructure will impact application performance
with the new data center.
2) Establish an SLO baseline by measuring application performance
before the move. Users’ needs and expectations don’t exist
in a vacuum. Pre-move transaction response times provide essential
context for determining reasonable SLOs for after the move.
3) Measure post-move application performance in a virtual environment.
The only way to accurately predict the impact of server moves on application
performance is to run those applications in a fully simulated post-move
environment. This will provide the specific data on potential performance
degradations essential for proper planning.
4) Identify applications that need special performance tuning.
Rather than wasting time, effort, and money on beefing up all elements
of your enterprise infrastructure, focus instead on specific applications
and/or network components that may be particularly problematic.
5) Analyze problems and validate potential fixes for failing
applications. Before investing in and deploying a solution, it’s
important to make sure that it actually works.
6) Assess dependencies between back-end servers to establish
a move plan and adjust the DR scheme. This, too, should be done by
simulating each planned interim stage of the move – as well as
the final post-move environment.
7) Manage user expectations and get buy-in commitments through
hands-on acceptance. Users who merely hear that a transaction response
time will go from two seconds to five may object out of sheer reflex – or
they may accede without realizing how long five seconds really is.
Business users should therefore be given the opportunity
to directly experience post-move application performance in advance
so they can offer informed consent to the relocation plan.
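The baseline, prediction, and triage steps in the plan can be tied
together in a small sketch. It is illustrative only: the prediction
uses an assumed model (baseline plus turns times added delay) rather
than a full simulation, and the application names, turn counts, and
SLO values are hypothetical.

```python
def flag_for_tuning(apps, added_delay_ms):
    """Return applications whose predicted post-move time exceeds their SLO.

    Each entry in `apps` is (name, baseline_seconds, turns, slo_seconds).
    Prediction uses an assumed model: baseline + turns * added delay.
    """
    flagged = []
    for name, baseline, turns, slo in apps:
        predicted = baseline + turns * added_delay_ms / 1000.0
        if predicted > slo:
            flagged.append((name, round(predicted, 2)))
    return flagged


# Hypothetical applications and figures, for illustration only.
apps = [
    ("order-entry", 3.0, 540, 7.0),  # very chatty
    ("reporting", 4.0, 20, 7.0),     # few turns
    ("directory", 0.5, 100, 5.0),
]
flagged = flag_for_tuning(apps, 50)
print(flagged)
```

Only the flagged applications would then go forward for analysis and
fix validation, which keeps tuning effort focused on the cases that
actually miss their objectives.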
By following this seven-step plan, IT organizations can substantially
reduce risk, eliminate unnecessary infrastructure spending, accelerate
time-to-benefit, and overcome a wide range of potential political pitfalls.
The exclusion of any of these steps greatly increases the likelihood
that unforeseen problems will sabotage the project. To ensure the success
of any data center relocation or consolidation initiative, IT must
pool its expertise in cross-disciplinary planning teams and fully leverage
available simulation technologies.
Amichai Lesser is the director of product marketing at Shunra Software.
Lesser is responsible for product marketing, market analysis, and field
marketing programs and has extensive experience in real-time engineering,
performance management, and security. Lesser can be contacted at amichai.lesser@shunra.com.
© Copyright Systems Support Inc. All rights reserved. Reproduction in whole
or in part in any form or medium without the express written permission
of System Support Inc. is prohibited.