They know that the future isn’t about first-generation problems, like
lightning strikes and floods. Those are old news to data center
planners. From macroscopic approaches like redundant data centers in
different states, to more micro precautions such as redundant power
supplies and onsite power conditioning, planners are ready for the
classic “act of God” scenarios. Through rain and hail and dark of
night, computer systems will keep running and no one will notice a
thing.
Planners are also well past second-generation problems, like terrorist
attacks, sudden system outages, and man-made disasters. Backup
copies of critical data and systems are made more frequently, and
succession processes rivaling those of the White House specify which
administrators will take over if the primary administrators
are incapacitated. (Most even designate an “Al Haig.”)
But many planners are just now beginning to come to grips with the
third generation of disaster; namely, the world of sudden loss of staff
and change in business. At the risk of focusing too closely on one
potential cause, let’s call this the pandemic world.
In the pandemic world, the first part of the disaster happens
obviously, if slowly. Staffing starts to shrink. Whether people are
actually out sick, unable to commute because of tightened health
regulations, or simply afraid, there will be significantly fewer IT
staff available on site to make changes in the data center.
This adds risk to a bad situation, making the plans, infrastructure,
and processes in place to deal with a first-generation or
second-generation disaster possibly untenable.
Yet the staff issue is only the first part of the problem. The second
part of the problem occurs when the business needs to alter its
infrastructure to address the shifting worker demographic. With so many
staff out or working remotely, the load on e-mail, remote access
systems, and security/validation systems increases dramatically.
Ideally, the company would rebuild or reconfigure its data
center to serve this mostly off-site workforce rather than the previously
in-house, 9-to-5 workforce. But that goal runs straight into the challenges
of the staff issue, making the situation exponentially harder.
Planners see the paradox. The business will need more onsite staff to
reconfigure systems to make up for the lack of onsite staff … and so we
have a third-generation “slow” disaster, in which a company unravels in
days or weeks instead of hours, but just as inexorably and fatally.
The solution, of course, is to plan: to start early, building an
“adaptive” or rapidly reconfigurable data center, so that machines can
be repurposed to meet business needs remotely and semi-automatically,
as needed. This has the added benefit of improving efficiency during
normal operations and of enabling a more effective response to first-
and second-generation disasters as well.
There are several ways to get to a rapidly reconfigurable data center.
The traditional approaches include some combination of virtualization,
automated provisioning, and remote management. Although solutions for
these approaches have significantly matured over the past few years and
are available today from multiple vendors, there are still many issues
to work out.
First, each type of data center resource, such as servers, network,
storage, and software (applications), requires a different
virtualization and provisioning solution, usually supplied by a
different vendor. To say the least, this adds complexity and requires a
greater degree of coordination to design and automate data center
reconfiguration. Adapting and maintaining such a setup also becomes
harder as inevitable changes are made to the physical infrastructure
over time. The same applies to managing changes to the applications.
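To make that coordination burden concrete, here is a minimal sketch in Python, assuming hypothetical stand-in tools for compute, network, and storage provisioning; the class and method names are illustrative placeholders, not real vendor APIs. Even a single repurposing step must be scripted across three independently managed layers and kept in sync as each layer changes.

# Illustrative sketch only: each class stands in for a separately
# purchased provisioning tool. One logical change touches all three.

class ComputeTool:
    def deploy_image(self, server: str, image: str) -> None:
        print(f"[compute] deploying {image} onto {server}")

class NetworkTool:
    def assign_vlan(self, server: str, vlan: int) -> None:
        print(f"[network] placing {server} on VLAN {vlan}")

class StorageTool:
    def map_lun(self, server: str, lun: str) -> None:
        print(f"[storage] presenting {lun} to {server}")

def reconfigure(server: str, image: str, vlan: int, lun: str) -> None:
    """A single repurposing step spans three independently managed
    systems; each must be scripted, sequenced, and kept current as the
    physical infrastructure changes underneath."""
    ComputeTool().deploy_image(server, image)
    NetworkTool().assign_vlan(server, vlan)
    StorageTool().map_lun(server, lun)

reconfigure("blade-07", "vpn-gateway.img", vlan=210, lun="LUN-42")

If any one of the three layers drifts from the documented design, the whole workflow breaks, which is precisely why such setups grow harder to maintain over time.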
Second, data centers are made up of heterogeneous components:
different makes and models of servers, storage equipment, networking
gear, operating systems, and so on. Not all components are suitable for
all purposes, and even when they are from a hardware point of view, they
may not be from a connectivity point of view, namely, LAN and SAN. In
other words, the ability to run any application on any server is
necessary in a rapidly reconfigurable data center, but it is not
sufficient. The operators also need the ability to logically “re-cable”
the server to establish the right connectivity on the LAN and SAN so
that the applications running on that server can communicate with other
systems and access their data.
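As a rough illustration of the point above, the sketch below models the “necessary but not sufficient” check: a candidate server must fit the workload's hardware requirements and also be able to reach the right LAN segments and SAN fabrics before re-cabling even makes sense. All of the field and function names are assumptions for illustration, not any particular product's interface.

# Hypothetical sketch: hardware fit alone is not enough; the server's
# switch and fabric connections must also carry the workload's traffic.

from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    cpu_arch: str
    reachable_vlans: set = field(default_factory=set)
    reachable_san_fabrics: set = field(default_factory=set)

@dataclass
class Workload:
    name: str
    cpu_arch: str
    required_vlans: set = field(default_factory=set)
    required_san_fabrics: set = field(default_factory=set)

def can_host(server: Server, app: Workload) -> bool:
    """True only if the server fits the workload's hardware needs AND
    its LAN/SAN connectivity can reach everything the workload requires."""
    hardware_ok = server.cpu_arch == app.cpu_arch
    lan_ok = app.required_vlans <= server.reachable_vlans
    san_ok = app.required_san_fabrics <= server.reachable_san_fabrics
    return hardware_ok and lan_ok and san_ok

spare = Server("rack9-slot5", "x86_64", {110, 210}, {"fabric-A"})
payroll = Workload("payroll-db", "x86_64", {120}, {"fabric-A"})
print(can_host(spare, payroll))  # False: VLAN 120 is unreachable until re-cabled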
While server virtualization software helps neutralize the differences in
server hardware, it typically dictates a “shared everything” model for
the network and storage to solve the connectivity problem. This model
compromises security, violates traffic-isolation requirements, and, in
many cases, is physically impossible to achieve. Automated provisioning
does not offer much to mitigate this, either.
What’s needed is true server repurposing. That is, the ability to move
all aspects of a server’s operational “personality” from one physical
context to another. This includes the software, the network
configuration, the SAN configuration, as well as the associated port
configurations on the switches to which the server is physically
connected. The new server may be in a different physical location,
connected to a different set of LAN and SAN switches. It may even be a
different make and model. It may not even be a physical server.
Regardless, true server repurposing must transcend all these challenges
by providing an abstraction that normalizes all the variables. It
should be the logical equivalent of ripping the disks, the NICs, the
HBAs, and the switch ports out from the original server/switches and
reinstalling all of it at a new location. In effect, server repurposing
makes use of the server, storage, and network virtualization already in
place and ties them together into a simple operational framework
focused on IT response to events rather than on infrastructure or
process design.
With true server repurposing, you can rack once, cable once, and repurpose your servers repeatedly and effortlessly.
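A minimal sketch of what such a “personality” might look like appears below, assuming an illustrative data structure rather than any specific vendor's implementation; the repurpose step simply detaches the bundle from one physical context and reapplies it in another.

# Rough sketch of the "personality" abstraction described above; the
# structure and function names are illustrative, not a real product.

from dataclasses import dataclass

@dataclass
class ServerPersonality:
    """Everything that defines what a server *is*, decoupled from the
    physical box: boot image, MAC/WWN identities, VLANs, SAN zoning,
    and the associated switch-port settings."""
    boot_image: str
    mac_addresses: list
    wwns: list
    vlan_config: dict
    san_zoning: dict
    switch_port_config: dict

def repurpose(personality: ServerPersonality, from_host: str, to_host: str) -> None:
    """Logically rip the disks, NICs, HBAs, and switch-port settings out
    of one physical context and reinstall them in another. The target may
    be a different make, model, or location, or even a virtual machine."""
    print(f"detaching personality '{personality.boot_image}' from {from_host}")
    print(f"reapplying identities, VLANs, zoning, and port settings on {to_host}")
    print(f"booting {to_host} as the repurposed server")

mail_relay = ServerPersonality(
    boot_image="mail-relay.img",
    mac_addresses=["00:16:3e:aa:01:02"],
    wwns=["50:06:0b:00:00:c2:62:00"],
    vlan_config={"eth0": 210},
    san_zoning={"hba0": "zone-mail-store"},
    switch_port_config={"sw1/port12": "trunk 210"},
)
repurpose(mail_relay, from_host="rack3-slot2", to_host="vm-cluster-east")

Because the personality travels as a single unit, the destination can differ in make, model, and location, or not be a physical server at all, which is exactly the normalization described above.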
Here’s our thesis: planners today have the tools to meet the pandemic
challenge. It may require some unconventional, out-of-the-box thinking,
but that’s par for the course for IT planners anyway.
"Appeared in DRJ's Spring 2007 Issue"
Double Jeopardy In A Disaster - Computing Data Center Challenges In A Pandemic W
Written by Kevin Epstein Friday, 19 October 2007 13:16Login to post comments




