DISASTER
RECOVERY
JOURNAL
P. O. Box 510110
St. Louis, MO 63151
(314) 894-0276
Fax: (314) 894-7474
Internet
www.drj.com
E-mail drj@drj.com
DATA STORAGE
Protecting Your Data With Adequate Storage, Backup
By DEREK GAMRUDT
Unexpected business
data disasters happen all of the time. They can occur anywhere and do
not need to be a headline-grabbing event such as an earthquake or major
fire to cause serious problems. Actually, most data disasters are the
result of a small mishap: a lost file that was not saved, a thrown-out
or misplaced disk or tape, an inadvertent deletion of a critical
file, or a power surge that wipes out your media.
Any event that results in lost data can seriously impact
a business: lost productivity, degraded customer services, liability
problems, and decreased revenues are just a few of the possible negative
ramifications. In the areas of e-commerce and transaction processing,
the guaranteed availability and reliability of stored information is
demanded 24 hours a day, making data backup and protection a critical
business function. Today's businesses cannot afford a data disaster;
they must have a data backup and recovery plan in place.
Admittedly, however, data storage and backup are confusing. Designing
complex storage solutions is more an art than a science. The variables
to be considered are many, and the debate among storage vendors is approaching
the intensity of a religious debate. The level of confusion among the end-user
community has reached the point where some have simply abandoned the
concept of protected networked storage until the market can come to
a consensus as to how it should best be deployed.
Let's try to put into perspective the ever-changing storage management
landscape and talk about the many options for storing and safeguarding
your organization's most critical asset: its data. And let's
do it in simple, straightforward terms.
Networked Storage Simplified
From the start, we will agree to not lead into a discussion about storage
networks using acronyms without first explaining each one. And secondly,
we will further unmask the storage mystique by simply categorizing
all of the storage implementation options into one of two areas.
Whether it is Network Attached Storage (NAS) or a Storage Area Network
(SAN), Fibre Channel (FC) or Internet Protocol (IP), the only alternative
to Direct Attached Storage is Networked Storage.
A Brief History Of Storage Connectivity Options
In the beginning, the earth was dark and without form. Shortly thereafter,
mainframes dominated the planet and storage was handled by simply plugging
disk and tape subsystems into a mainframe channel. Eventually, open
systems such as UNIX, OS/2, Novell, and Windows NT began getting deployed
in great numbers, and a new high performance interface, named the Small
Computer Systems Interface (SCSI), was developed to deal with the massive
140 Megabyte (MB) disk drives of the day. Initially, SCSI-1 had a meager
5 MB/Second throughput rate, but over time, SCSI throughput doubled
with each new version, and we now have SCSI throughput speeds of 160
MB/Second. The concept of attaching the storage device directly to the
server is often referred to as Direct Attached Storage (DAS) or Server
Attached Storage (SAS).
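The doubling described above is easy to check with a few lines of Python. This is only a sketch of the arithmetic, using the two figures quoted in the text (5 MB/second and 160 MB/second); the function name is made up for illustration and does not correspond to any particular SCSI generation.

```python
def doubling_steps(start_mb_s, end_mb_s):
    """Return each throughput value reached by repeated doubling."""
    rates = [start_mb_s]
    while rates[-1] < end_mb_s:
        rates.append(rates[-1] * 2)
    return rates


# Figures from the article: 5 MB/second at the start, 160 MB/second today.
rates = doubling_steps(5, 160)
print(rates)                          # [5, 10, 20, 40, 80, 160]
print(len(rates) - 1, "doublings")    # 5 doublings
```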
So now you are probably wondering, "Heck, that's faster than
storage networks. Why in the world mess with putting storage on the
network in the first place?" An excellent question indeed.
You see, direct attached SCSI or DAS has significant limitations. There
are limits to the number of devices that can be attached to a SCSI Host
Bus Adapter (commonly referred to as an HBA); 6-15 devices was the maximum.
There is also a limit as to the distance that a SCSI device can be separated
from its server (6-25 meters). Deploying a DAS device also meant that
in order to access it, you had to attach the storage unit (e.g., RAID,
disk jukebox or tape library) to the server that was managing that device,
which can be an administrative nightmare. There had to be a better way.
Enter Network Attached Storage
In 1987, Auspex Systems introduced the world's first Network Attached
Storage (NAS) server, a high-powered, thin file server with large storage
capacity for the growing demand of networked users sharing files. Taking
advantage of the growing popularity of Sun Microsystems' Network File
System (NFS), Auspex offered companies the ability to place storage
directly onto the network where it could easily be shared by the users
without attaching to a general purpose server. The response was dramatic,
and the first networked storage architecture was firmly ensconced in
the vernacular of the storage market.
The fundamentals of NAS have changed little over the years. There is
now support for Microsoft's Common Internet File System (CIFS)
and the Network Data Management Protocol (NDMP) is now used to move
data to a backup device, but the basics are the same. A NAS solution
is essentially an appliance that consists of a special purpose operating
system and processor that is optimized to serve and store data at the
file-level across a TCP/IP network.
Storage Area Networking
SCSI-based storage and NAS-based configurations are both important ways
of bringing storage to the network, but they are best utilized in situations
where there is a relatively low volume of data traversing the network.
This is because the movement of large amounts of data files between
the server and the storage device can gobble up available network bandwidth,
and cause degradation of LAN performance. In short, the storage-to-server
file transfers hog the network's pipeline, shutting out or limiting
its availability to users.
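To get a feel for how quickly bulk storage traffic can tie up a LAN, here is a rough back-of-the-envelope sketch. The numbers are assumptions chosen for illustration, not measurements from the article: a 100 Mbit/second LAN (about 12.5 MB/second) and a 50 GB backup job. Protocol overhead is ignored, so real transfers would take longer.

```python
def transfer_seconds(size_mb, link_mb_per_s, share=1.0):
    """Seconds to move size_mb megabytes over a link.

    share is the fraction of the link the transfer actually gets
    (1.0 means it has the whole pipe to itself).
    """
    return size_mb / (link_mb_per_s * share)


# Assumed figures, not from the article.
backup_mb = 50 * 1000      # a 50 GB backup
lan_mb_per_s = 100 / 8     # a 100 Mbit/second LAN is about 12.5 MB/second

alone = transfer_seconds(backup_mb, lan_mb_per_s)
shared = transfer_seconds(backup_mb, lan_mb_per_s, share=0.5)

print(f"whole LAN:   {alone / 3600:.1f} hours")
print(f"half of LAN: {shared / 3600:.1f} hours")
```

Even with the whole network to itself, the job occupies the LAN for over an hour; if it only gets half the bandwidth, the time doubles.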
Large enterprises that want the ability to store and manage vast amounts
of information while maintaining an overall high-performance network
environment now have another option: the Storage Area Network (SAN).
In a SAN environment, storage devices such as tape libraries and RAID
arrays can be connected to a storage switch and can communicate with
servers executing on different platforms. These communications paths
between server(s) and storage device(s) are via a high-speed interconnection,
such as Fibre Channel (FC), or Internet Protocol (IP)-based approaches
such as Internet SCSI (iSCSI) or Storage over IP (SoIP™). These
setups allow for any-to-any communication among all devices on the SAN.
They also provide alternative paths from server to storage device. In
other words, if a particular server is unavailable, another server on
the SAN can access the storage device. A SAN also makes it possible
to mirror data, making multiple copies available. The high-speed interconnection
that links servers and storage devices essentially creates a separate,
external network that's connected to the LAN but acts as an independent
network.
There are a number of advantages to SANs. SANs allow for the addition
of bandwidth without burdening the messaging network, or LAN. SANs also
make it easier to perform online backups without users feeling the bandwidth
pinch. SANs also provide a method for scaling up storage capacity without
interrupting network operations.
The Future
As if things are not confusing enough, there is a new storage connection
interface that is being developed, InfiniBand. InfiniBand is not just
a fancier storage area network; in fact, the InfiniBand documentation
stresses that it is a system area network. It supports not only storage
devices but also other system peripherals, including input, video, graphics,
and output devices. InfiniBand merges both storage area networks and
system area networks and gets the PCI bus out of the way. Computers
will still have an internal path to memory for communication within
the box, but InfiniBand interfaces talk directly to the memory controller,
bypassing the PCI bus. This is the same principle as the old mainframe's
Direct Memory Access (DMA) bus. However, the end nodes in an InfiniBand
network can be computers, routers, or I/O devices (such as SCSI disks,
Fibre Channel networks, or even video boards).
InfiniBand grew out of two separate initiatives aimed at eliminating
the current limitations of the PCI bus. Intel announced Next-Generation
I/O (NGIO) in 1998. Compaq Computer, Hewlett-Packard, IBM, and 3Com
developed a competing standard called Future I/O. The two standards
were remarkably similar: both had a switched fabric, channel-based communication
bypassing the traditional I/O bus. In fact, in the early stages, the
preliminary designs were difficult to distinguish from one another.
In 1999, the two groups got together and decided to merge their proposals
into System I/O, which became InfiniBand. InfiniBand appears to be a
general solution, combining aspects of storage area networks, system
area networks, and I/O buses. Compaq, Dell, HP, IBM, Intel, Microsoft,
and Sun lead the InfiniBand Trade Association.
Positioning Storage Options
In order to position the various technologies, we need to expand on
the storage model in figure 1.

Direct Attached Storage
Direct Attached Storage (DAS or SAS) in the open systems market is currently
limited to two options. Option 1 is good old SCSI; the second choice
is Fibre Channel. But wait! Isn't Fibre Channel synonymous with
SAN? Not in a direct-attached, point-to-point configuration, as shown in figure
2.

So, are server-attached storage configurations still viable? Absolutely!
In situations that require high-speed, network-free, server-to-storage
access, DAS is still an excellent alternative. Another place DAS makes
sense is for sites that are budget constrained and are not swayed by the
lower Total Cost of Ownership (TCO) and better Return on Investment (ROI) that
networked storage offers. DAS is also a good choice for remote locations
that have a small number of servers with light user loads. Finally,
some peripherals such as tape drives do not offer an FC interface, so
unless a SCSI-to-Fibre Channel router is purchased, DAS may be the only
viable option.
So when do you use SCSI vs. Fibre Channel as the interface to a DAS
configuration? The answers depend on your technology game plan.
Some sites are slow adopters of technology, and they are most comfortable
with doing things the "tried and true" way they have always
been done. These are generally smaller organizations or departments,
but may have the need for some large capacity storage. In these instances
a SCSI-based solution is best. Fibre Channel DAS, on the other hand,
makes sense for customers that have a SAN in mind for the future and
want to ease into the technology slowly. Fibre Channel is also the option
of choice for attaching multiple servers to a shared enterprise disk
RAID system in a multi-hosted point-to-point configuration. Figure 3
is an example of this design.

Whether SCSI or FC interfaces are employed, a StorNet study found that
about 85 percent of all storage deployed at our customers is DAS.
Networked Storage
Once a potential user has accepted the limitations of a DAS storage
model, and decides to go with a networked storage alternative, things
get interesting in a hurry. Once again there are currently only two
methods of designing and configuring a storage network, NAS or SAN.
Which one is best depends on the organization's or site's needs.
Regardless, the benefits of implementing a storage network are immediate
and obvious.
Put another way, the options for implementing storage networks are really
quite simple. A potential user can choose to use either a file-level
based, shared storage resource such as a NAS solution, or a high performance,
switched fabric block-level approach such as SAN. Figure 4 will expand
on these views of networked storage.

If we then look at storage networks as being inclusive of both NAS and
SAN, why do storage vendors make it an either/or decision? Therein lies
the question that has potential end-users scratching their heads and
wondering what to do next. And rather than make a decision, many companies
simply continue buying DAS from their server vendors. This is unfortunate,
as they lose the opportunity to implement an improved enterprise storage
solution.
There is, however, an alternative. An experienced storage solutions
and services integrator can be consulted, even if it is only for
an assessment and design document that demonstrates the
pros and cons of the different storage options and configuration choices.
Network Attached Storage
The standards for NAS are strong standards indeed. There are two networking
standards for accessing network-attached data. The Network File System
(NFS) is the de facto standard for the UNIX community, and the Common
Internet File System (CIFS) is the standard for all flavors of the Windows
Operating System. NAS devices provide the ability to support true file
sharing between NFS and CIFS servers.
In a NAS configuration, the actual file system is resident on the NAS
device itself, freeing up the CPU of the application server from having
to manage the I/O associated with a file system. In a nutshell, NAS
servers off-load all of the functions of organizing and accessing all
directories and managing data on disk, as well as managing the cache.
NAS can also be employed to consolidate file-serving applications from
distributed UNIX and Windows NT servers to a single NAS platform.
Another ideal application for NAS is in technical engineering applications
such as geoseismic or pharmaceutical applications. These are environments
where multiple engineers or researchers may simultaneously access a
large file or group of files. Software development, document imaging,
and CAD/CAM design are all good places to recommend a NAS solution.
In summary, NAS is the best choice for UNIX and Windows NT data sharing
applications, consolidated file service applications, technical and
scientific applications, and other file-based storage needs.
NAS Choices
The appliance model of NAS (a.k.a. filer)
An appliance is a device that performs a single function very well.
A popular and accelerating trend in networking has been to use appliances
instead of general-purpose computers to provide common services. For
instance, special-purpose routers from companies like Cisco Systems
and Nortel Networks have almost entirely replaced general-purpose computers
for packet routing, even though general-purpose computers originally
handled all routing functions. Similarly, modern printers are more likely
to plug into the network than into a general-purpose computer. Other
examples of network appliances include network terminal concentrators,
network FAX servers, and network backup servers.
Appliances have been successful because they are easier to use, more
reliable, and have better price/performance benefits than general-purpose
computers. These benefits arise because appliances can be optimized
specifically for their single function, without the compromises necessary
to meet the many conflicting requirements of a general-purpose system.
For example, a typical NAS appliance may have less than 50 instructions
within its Operating System and not include any proprietary hardware
but instead use off-the-shelf popular components. Thus, NAS filers can
be quickly added to existing networks.
Additional individual appliance boxes can be added as necessary
according to storage needs, without the hassles of having to upgrade
the general purpose server or DAS; this is another reason why network
administrators have embraced the NAS concept.
Network Appliance is the current leader in NAS, and its network storage
appliance (a.k.a. filer) brings the advantages of an appliance to the
Windows and UNIX market. Filers cannot run applications and do not run
a general-purpose operating system like UNIX or Windows NT. Filers feature
ease of use and price/performance, and upward scalability that cannot
be matched.
However, a filer is designed to have a single brain (CPU) and large
amounts of disk capacity behind it. This approach has created some scalability
and failure issues that have been addressed with a newer clustered NAS
approach.
Clustered NAS
Clustered NAS is similar to the filer approach in most respects except
one: scalability. Rather than a single CPU with large amounts of disk
space behind it, clustered NAS offers relatively small chunks of storage
capacity, each with its own processor. These storage blocks
can then be connected together much like Lego blocks (see figure 5).

The benefit of this approach is that as capacity is added to the NAS
pool, incremental processing, cache, and connectivity are also added.
The end result is high scalability without sacrificing performance.
This approach is quickly gaining enthusiastic support.
Storage Area Networks
Although often seen as competing technologies, SAN and NAS in reality complement
each other very well to provide access to different types of data. SANs
are optimized for high-volume, block-oriented data transfers, while
NAS is designed to provide data access at a file level. Both technologies
satisfy the need to remove direct storage-to-server connections to facilitate
more flexible storage access.
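The difference between file-level and block-level access can be sketched in a few lines of Python. This is an illustration only, not how any NAS or SAN product is implemented: an ordinary temporary file stands in for both the shared file and the raw storage device, and the 512-byte block size and function names are assumptions made for the example.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for this sketch


def read_file(path):
    """File-level access (the NAS view): ask for a file by name and
    let the file system on the other end locate the data."""
    with open(path, "rb") as f:
        return f.read()


def read_block(device_path, block_number, block_size=BLOCK_SIZE):
    """Block-level access (the SAN view): ask for a numbered block;
    the caller is responsible for knowing what lives where."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * block_size)
        return dev.read(block_size)


# A temporary file holding two blocks stands in for the storage.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
    demo_path = tmp.name

whole = read_file(demo_path)        # everything, addressed by name
second = read_block(demo_path, 1)   # one block, addressed by number
os.remove(demo_path)

print(len(whole), second[:4])
```

In the file-level case the storage device owns the file system; in the block-level case the server does, which is why databases and backup applications that manage their own layout tend to prefer it.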
A storage area network (SAN) is a high-performance subnet, based on
Fibre Channel or IP, whose primary purpose is the transfer of data between
computer systems and storage devices, and among multiple storage elements
(e.g., direct disk-to-tape transfer). One can think of a SAN as an extended
and shared storage bus. A SAN consists of a communication infrastructure,
which provides physical connections, and a management layer, which organizes
the connections, storage elements, and computer systems so that data
transfer is secure and robust. While there is debate among industry
insiders, a switch is generally required in the configuration to qualify
as a SAN. Until recently, the only viable means of switching data paths
to a storage device was through a Fibre Channel switch. However, the
emergence of IP Storage protocols such as iSCSI and SoIP has extended
this capability to traditional IP networking switches as well.
Because SANs are optimized to transfer large blocks of data between
servers and storage devices, they are ideal for applications such as:
- Mission-critical database applications where predictable
  response time, availability, and scalability are essential
- Centralized storage backups where performance, data integrity,
  and reliability ensure that critical data is secure
- High-availability and application failover environments
  to ensure very high levels of application availability at reduced costs
- Scalable storage virtualization, which detaches storage
  from direct host attachments and enables dynamic storage allocation
  from a centralized pool
- Improved disaster tolerance, which provides high performance
  over extended distance between host server and connected devices
Fibre Channel vs. IP Networked SAN
Once the decision has been made to implement a SAN to reap the maximum
performance benefits of networked storage, the next step is to decide
whether it will be an FC-based SAN, an IP-based SAN, or a combination
of both. Note that the term IP Storage is used as opposed to iSCSI or
SoIP. This is because there is still no standard in place that precludes
the use of either of these protocol choices, although iSCSI has the
lion's share of attention in the market today.
As heated as the debate is between the NAS and SAN vendors, it pales
in comparison to the rhetoric surrounding which topology is best suited
to implement a SAN. Figure 6 shows SAN option choices.

The Case For Fibre Channel
Fibre Channel was designed specifically to address server-to-storage interface
limitations. At SAN sites, Fibre Channel interfaces are delivering measurable
operational benefits not previously possible with standard connections,
such as Direct Attached SCSI. For example, by connecting RAID to the
backend of a server over a Fibre Channel bus, higher bandwidth results
in quicker I/O transfers over longer distances than is possible with
a standard interface. The RAID and Fibre Channel combination team up
to improve storage subsystem reliability through fault-tolerant storage
array operations and redundant pipeline data paths.
In an FC-AL (Arbitrated Loop) based SAN, up to 126 nodes can be connected
per loop. Multiple loops can be added as needed. Switch-based SANs
provide unlimited scalability. This modular scaling capability provides
a sound infrastructure for long-term growth. The fibre communications
channel supports multiple protocols and has a current bandwidth limitation
of 200 MB/second. The FC interface can sustain this bandwidth up to
10 kilometers. Each storage unit on a FC network is a peer node, allowing
for the flexibility of direct storage device-to-storage device communications
via either arbitrated loop or switched fabric.
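The two Fibre Channel figures quoted above (126 nodes per arbitrated loop and 200 MB/second) lend themselves to a quick sizing sketch. The node count and data volume in the example are made up for illustration, and the calculation assumes the full quoted bandwidth is available, which a shared loop will not always deliver.

```python
import math

NODES_PER_LOOP = 126   # FC-AL limit quoted in the article
FC_MB_PER_S = 200      # bandwidth figure quoted in the article


def loops_needed(node_count):
    """Number of arbitrated loops required to attach node_count nodes."""
    return math.ceil(node_count / NODES_PER_LOOP)


def transfer_minutes(size_gb, rate_mb_s=FC_MB_PER_S):
    """Minutes to move size_gb gigabytes at the given rate (best case)."""
    return size_gb * 1000 / rate_mb_s / 60


# Hypothetical site: 300 nodes and a 120 GB nightly backup.
print(loops_needed(300))        # 3
print(transfer_minutes(120))    # 10.0
```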
Most Fibre Channel devices are dual ported. Using both ports in a dual-loop
configuration provides a redundant path to/from the device, guaranteeing
access should one path fail. This high-availability configuration is
ideal for mission-critical applications. FC interfaces provide the performance
required to meet an array of bandwidth intensive storage management
functions like backup, remote vaulting, and hierarchical storage.
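The idea behind a dual-loop configuration can be sketched as "try one path, fall back to the other." The code below is a simplified illustration of that logic only; real multipath failover is handled by HBA drivers and the operating system, and every name here (PathDown, read_via, loop_a, loop_b) is hypothetical.

```python
class PathDown(Exception):
    """Raised when a path to the storage device is unavailable."""


def read_via(paths, request):
    """Try each (name, send) path in order and return the first success."""
    errors = []
    for name, send in paths:
        try:
            return name, send(request)
        except PathDown as exc:
            errors.append((name, str(exc)))
    raise PathDown(f"all paths failed: {errors}")


def loop_a(request):
    # Simulate a failed primary loop.
    raise PathDown("loop A is down")


def loop_b(request):
    # The secondary loop answers normally.
    return f"data for {request}"


used, data = read_via([("loop A", loop_a), ("loop B", loop_b)], "block 42")
print(used, "->", data)   # loop B -> data for block 42
```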
Fibre Channel switches and hubs provide for simplified storage device
scalability, hot plugging of storage devices, and isolation between
functions. This translates into easily scalable bandwidth and improved
subsystem availability.
The Case For Internet Protocol
Fibre Channel's shortcomings are that it requires new skill sets
to be learned for building and managing the storage component, and the
price per SAN port is up to five times that of standard Ethernet IP
ports. IP storage developers make use of the existing network infrastructure
and capacity, thus eliminating the need for new expertise (training)
while holding down additional SAN port costs.
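To see what "up to five times" can mean for a budget, here is a small sketch. The per-port Ethernet price and the port count are placeholders invented for the example, not market figures, and the multiple of five is the upper bound quoted above, so the Fibre Channel total is a worst case.

```python
def port_cost(ports, ethernet_price, fc_multiple=5):
    """Return (ethernet_total, fc_total) for a given number of ports.

    fc_multiple defaults to 5, the upper bound quoted in the article.
    """
    ethernet_total = ports * ethernet_price
    return ethernet_total, ethernet_total * fc_multiple


# Hypothetical: 32 ports at an assumed 200 per Ethernet port.
ethernet_total, fc_total = port_cost(32, 200)
print(ethernet_total, fc_total)   # 6400 32000
```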
The Internet SCSI (iSCSI) IP protocol stores/retrieves data to/from
any SCSI storage device over an Ethernet port, which connects to the
existing IP core infrastructure. Alternatively, a second port, which
is Fibre Channel wired, could be used to connect directly to a storage
device (or to a switch for connectivity to a storage device). iSCSI
is economical because it maximizes the existing site networking infrastructure
and requires no additional storage management training.
Storage over IP (SoIP), a competing IP protocol, combines wire-speed
Gigabit Ethernet performance with support for SCSI, Fibre Channel, iSCSI,
and all types of NAS storage interfaces. This interoperability enables
the building of standards-based, manageable IP storage networks. SoIP
reduces the CPU overhead commonly associated with iSCSI's creating
and reassembling of the TCP/IP packets by replacing the server-based
drivers with an in-switch conversion process.
Summary
The foregoing storage discussion is but the tip of the iceberg when
making an enterprise data storage and protection
decision. The storage networking debate is not an either/or issue. The
answer is "it depends." It depends on your company and site,
your application, and your budget. Figure 7 shows the storage options
that are available, and is an excellent illustration for presenting
storage alternatives to your organization.
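For readers who like to see "it depends" written down, the rules of thumb from this article can be sketched as a simple function. This is a deliberately crude illustration: the parameter names and the order in which the questions are asked are assumptions made for the example, and a real decision would weigh many more factors.

```python
def suggest_storage(file_sharing, high_volume_block_io,
                    budget_constrained=False, few_servers=False):
    """Map a few yes/no answers to the storage options discussed above."""
    # Budget-constrained sites and small remote locations: DAS still fits.
    if budget_constrained or few_servers:
        return "DAS"
    # NAS and SAN complement each other when both needs are present.
    if file_sharing and high_volume_block_io:
        return "NAS and SAN"
    if high_volume_block_io:
        return "SAN"
    if file_sharing:
        return "NAS"
    return "DAS"


print(suggest_storage(file_sharing=True, high_volume_block_io=False))
print(suggest_storage(file_sharing=False, high_volume_block_io=True))
print(suggest_storage(file_sharing=True, high_volume_block_io=True))
```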

Derek Gamrudt is the chief technology officer at StorNet, Inc. (www.stornet.com).
Gamradt joined the company in 1990 and has extensive knowledge of all
aspects of storage management.
To comment on this article, go
to 1502-02 at www.drj.com/feedback.