When service-oriented architecture (SOA) concepts are applied to the world of disaster recovery (DR), some interesting possibilities begin to emerge. What if disaster recovery “services” could be architected to be available on-demand? For example, what if an IT organization could guarantee a recovery time objective (RTO) of 15 minutes or less – for any data or application, on any server or storage platform, at any location, regardless of the size of the data set? What if certain backup, replication or recovery functions could be quickly enabled or activated anywhere on a company’s local or extended network – whenever and wherever they were needed?
If this seems like a utopian ideal not bound by the reality of today’s cost or implementation constraints, think again. This article describes a paradigm shift for DR that involves the delivery of on-demand disaster recovery services via a common, vendor-agnostic “services-oriented platform.”
Combining the tenets of SOA with technological advances, such a platform allows critical data protection and recovery services to be offered in an independent services layer that resides above underlying servers and storage systems. It can also improve business continuity-related service goals for RTO and recovery point objective (RPO) at the lowest possible cost.
Departing from the Old Way
Traditionally, IT organizations rolling out different phases of a disaster recovery initiative delivered protection and recovery services only by combining a point solution from one vendor with other point solutions from other vendors.
If you needed advanced data replication, this functionality usually came as part of a high-performance storage array purchased from a particular vendor. Needed security or data encryption? This would come from another vendor’s point product. Likewise, continuous data protection (CDP) functionality – a relatively new addition to the DR toolset – might come from a third solution.
One challenge with this model, however, is the growing cost and complexity involved in managing or scaling such disparate solutions, let alone tying them together via custom scripts. In the event replication software is tied to a specific array, other costs can emerge as users outgrow the array and are then forced to perform a technology refresh by upgrading to a higher capacity system and planning the ensuing data migration.
While a services-oriented DR architecture could be manually developed by tying such point solutions together to offer a core set of data services, the challenges and cost of doing so can quickly outweigh the potential benefits involved.
SOA and DR Delivered from a Fabric-Level Platform
To move from a service-oriented architecture to a more automated, services-oriented platform for disaster recovery, however, a paradigm shift is needed. What has been missing thus far is a simple yet scalable way to offer core DR-related data services from within the storage network fabric that connects all critical servers with their underlying network storage subsystems. Figure 1 depicts this type of fabric-level delivery of DR services within a centralized network cloud called the “Data Services Engine.”
Technological advances and the growing popularity of virtualization have made it possible for organizations to adopt this type of services-oriented platform and draw on services from within the fabric whenever they are needed. Providing the intelligence for core data services at the fabric layer, rather than within the storage itself or on the host, provides multiple benefits, including:
- Simplification of the overall infrastructure and standardization of operations
- Higher service quality and the ability to meet more demanding service level agreements (SLAs), in terms of both RTO and RPO
- Reductions in overall IT capital and operating expenses
- More unified provisioning of services and storage capacity, as needed
- Ability to provide high levels of availability and functionality across bulk storage
- Protection across heterogeneous storage arrays, often from different vendors
- Closer integration with applications for more consistent, rapid recovery
How achievable are these types of benefits once you make the shift to a services platform? One customer that recently switched to a services-oriented platform was a large financial firm. After the switch, the firm anticipated the following return on investment (ROI) from the move:
- A 1200 percent improvement in recovery times (moving from a 24-hour RTO down to 30 minutes and an RPO that translated into zero data loss)
- A 50 percent reduction in storage costs, resulting in over $2.5 million in savings
- An 80 percent cost reduction in bandwidth needed for DR
- An 80 percent improvement in utilization of existing storage
In all, the financial firm expects to save over $6.3 million in the first five years of operation over its legacy DR approach.
Requirements of a DR Services-Oriented Platform
For comprehensive disaster recovery, a services-oriented platform should exhibit the following range of features:
- Encryption – reduces the risk of exposure when data is replicated or stored off-site.
- Duplicate data reduction or elimination – reduces bandwidth and the amount of data replicated.
- Compression – reduces bandwidth and the amount of data replicated.
- Optimized management of changed data (deltas) – can also reduce the amount of data replicated.
- Virtualization functionality – allows use of existing or legacy storage equipment and the ability to use more competitive alternatives for future storage.
- Single console DR management/provisioning – makes DR services easier to enable, manage or modify. Such capabilities should include the ability to centrally provision DR services and capacity needed across disparate, underlying platforms.
- Rapid, disk-based “snapshot” functionality – gives administrators the option to quickly mount data drives at a remote site, and more efficiently utilize existing storage for recovery.
- Continuous data protection (CDP) – makes it easier to meet an organization’s most stringent RPO.
- Continuous data replication (CDR) – enables the most immediate replication of deltas (changed data).
- Replication support for storage tiers – supports cost-effective, rapid recovery from primary, Fibre Channel disk systems to lower-cost Serial ATA (SATA) disk systems.
- Reporting features – ensure proper, on-going operation and more streamlined DR testing.
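To make the bandwidth-related items in this list more concrete, the following is a minimal Python sketch of how delta management, duplicate elimination and compression can combine to shrink what crosses the wire during replication. It is an illustration only, not any vendor's implementation: the 4 KB block size, the function names and the use of SHA-256 and zlib are all assumptions made for the example. Encryption is left out because the Python standard library has no cipher; in practice it would be applied to the payload after compression.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a volume image into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def plan_replication(old: bytes, new: bytes, remote_hashes: set[str]) -> dict:
    """Work out what must be sent to bring the remote copy from `old` to `new`."""
    old_blocks = split_blocks(old)
    payload = {}     # block hash -> compressed block, each sent at most once
    references = []  # (block index, block hash) for every changed block
    for index, block in enumerate(split_blocks(new)):
        # Delta management: unchanged blocks are skipped entirely.
        if index < len(old_blocks) and old_blocks[index] == block:
            continue
        digest = hashlib.sha256(block).hexdigest()
        references.append((index, digest))
        # Duplicate elimination: send a block only if the remote site
        # does not already hold it and it is not already queued.
        if digest not in remote_hashes and digest not in payload:
            payload[digest] = zlib.compress(block)  # compression
    return {"payload": payload, "references": references}


def bytes_on_wire(plan: dict) -> int:
    """Total size of the compressed blocks that would be transmitted."""
    return sum(len(chunk) for chunk in plan["payload"].values())


# Three-block volume in which two blocks change to the same new content.
old_image = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
new_image = b"A" * 4096 + b"D" * 4096 + b"D" * 4096
plan = plan_replication(old_image, new_image, remote_hashes=set())
```

In this toy case only two of the three blocks are referenced as changed, and because both carry identical content, a single compressed block is queued for transmission.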
Virtualization: The ‘Secret Sauce’ Behind an Effective DR Services Platform
The paradigm shift to a delivery-services platform for DR is now possible due largely to new developments in virtualization. Company servers and their respective compute resources have already begun to embrace their own virtual revolution for DR – thanks to software that allows multiple ‘virtual machines’ to be developed, mirrored, protected and recovered on top of larger physical servers, with a related reduction in required physical footprints.
Now, storage virtualization technologies have similarly matured to offer the scalability and fault-tolerance needed to support a robust platform of core DR-related data services – all available from within the storage network “cloud” or fabric layer.
In the past, organizations have been reluctant to deploy storage virtualization technology en masse. Many virtualization solutions seemed to require complex migration of a company’s existing data assets into a virtualization layer, with little chance to revert to the old way if the company wasn’t happy with the results of such a move. Still others engineered virtualization and its accompanying DR services via the use of large data caches, which came with their own set of challenges to continuous, streamlined operations and availability. Figure 2 illustrates some of these limitations and challenges.
A new breed of storage virtualization technology has since begun to emerge. Some solutions now offer robust virtualization platforms that allow data to be manipulated as part of a virtualized pool – all while it remains in its native format, on native storage devices. No data migration is required. In the event a company wants to bring its data back to a pre-virtualized state, a simple change to a fabric zone set or mapping table can accomplish the task. No caching is used. DR services remain constantly available from within the fabric-based, virtualized layer.
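The idea that virtualizing and un-virtualizing a volume is only a mapping-table change can be sketched in a few lines of Python. This is a simplified model under stated assumptions: `PhysicalLun` and `FabricMapper` are hypothetical names invented for the example, and a real fabric would work with zone sets and SCSI-level addressing rather than Python dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalLun:
    """A LUN on a native array; its data is never moved or reformatted."""
    array: str
    lun_id: int
    blocks: dict[int, bytes] = field(default_factory=dict)


@dataclass
class FabricMapper:
    """Hypothetical fabric-level table: virtual volume name -> physical LUN."""
    table: dict[str, PhysicalLun] = field(default_factory=dict)

    def virtualize(self, name: str, lun: PhysicalLun) -> None:
        # Only a table entry is created; no data is copied or migrated.
        self.table[name] = lun

    def unvirtualize(self, name: str) -> PhysicalLun:
        # Reverting removes the entry and hands back the untouched LUN.
        return self.table.pop(name)

    def write(self, name: str, block_no: int, data: bytes) -> None:
        # Pass-through write: nothing is held in a cache in this layer.
        self.table[name].blocks[block_no] = data

    def read(self, name: str, block_no: int) -> bytes:
        return self.table[name].blocks[block_no]


lun = PhysicalLun(array="array-a", lun_id=7, blocks={0: b"payroll"})
mapper = FabricMapper()
mapper.virtualize("vol1", lun)
mapper.write("vol1", 1, b"ledger")
restored = mapper.unvirtualize("vol1")
```

Because the mapper holds only references, writes made through the virtual volume land directly on the native LUN, and removing the table entry leaves that LUN exactly as a server would see it without the virtualization layer.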
Figure 3 shows the capabilities of emerging storage virtualization solutions that offer clustered, fabric-based “nodes” with virtualization functionality and related DR services provided as part of the transparent, virtualized abstraction layer. These all operate as a normal part of the data path between servers and their underlying storage devices. With the ability to push data between servers and storage at as much as 32GB per second and automatic failover to other nodes in the event one node goes down, such technologies now allow recovery services like CDP or CDR to be quickly “turned on” as a service for each critical application, not to mention the application’s larger consistency groups of interrelated upstream or downstream data.
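As a rough illustration of what it means to turn CDP on for an application's consistency group, the Python sketch below journals every write across a group of volumes and rebuilds all of them to the same point in time. The class name, the integer "tick" timestamps and the in-memory journal are assumptions made for the example; node clustering and failover are not modeled.

```python
from dataclasses import dataclass, field

Blocks = dict[int, bytes]


@dataclass
class ConsistencyGroup:
    """Volumes that must be recovered together to the same point in time."""
    volumes: dict[str, Blocks] = field(default_factory=dict)
    cdp_enabled: bool = False
    baseline: dict[str, Blocks] = field(default_factory=dict)
    journal: list[tuple[int, str, int, bytes]] = field(default_factory=list)

    def enable_cdp(self) -> None:
        # "Turning on" the service: capture a baseline, then start journaling.
        self.baseline = {name: dict(blocks) for name, blocks in self.volumes.items()}
        self.journal.clear()
        self.cdp_enabled = True

    def write(self, tick: int, volume: str, block_no: int, data: bytes) -> None:
        # Every write is journaled with its logical timestamp before it lands.
        if self.cdp_enabled:
            self.journal.append((tick, volume, block_no, data))
        self.volumes.setdefault(volume, {})[block_no] = data

    def recover(self, as_of: int) -> dict[str, Blocks]:
        """Rebuild every volume in the group as of logical time `as_of`."""
        state = {name: dict(blocks) for name, blocks in self.baseline.items()}
        for tick, volume, block_no, data in self.journal:
            if tick <= as_of:
                state.setdefault(volume, {})[block_no] = data
        return state


group = ConsistencyGroup(volumes={"db": {0: b"v0"}, "log": {0: b"l0"}})
group.enable_cdp()
group.write(1, "db", 0, b"v1")
group.write(2, "log", 0, b"l1")
group.write(3, "db", 0, b"corrupt")   # e.g. a bad write at tick 3
recovered = group.recover(as_of=2)    # roll both volumes back to tick 2
```

Recovering the group as of tick 2 returns the database and its log as they stood just before the bad write, which is the property a consistency group is meant to preserve across interrelated data.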
The result marries such technological advances with a more holistic approach toward the delivery of data provisioning, data protection, replication and recovery as a set of vendor-agnostic, independent services. Rather than limiting choice and increasing business continuity costs, such a marriage is able to drive simplicity, reduce connectivity costs and open the door to less-expensive options for both hardware and software.
Chris Poelker is vice president of enterprise solutions for FalconStor Software and spends most of his time with large Fortune 1000 companies defining strategy for virtualization and business continuity solutions. Prior to FalconStor, Poelker was a lead senior systems storage architect at Hitachi Data Systems and Compaq Computer, Inc., in New York. Poelker’s certifications include: MCSE, MCT (Microsoft Trainer), MASE (Compaq Master ASE Storage Architect) and A+ certified (PC Technician). He is a sought-after speaker and writer, and the author of “SAN for Dummies.”
"Appeared in DRJ's Summer 2008 Issue"