The potential for disaster is easy to calculate if your data center sits in Hurricane Alley. This hurricane season and in the seasons to come, tropical storms and hurricanes will thunder through Florida, barrel up the East Coast or crash onto the shores of the Gulf States, bringing floods, power outages and wind damage.
Until recently, the ultimate answer to a significant natural disaster was to build an offsite disaster recovery (DR) center. But a dedicated DR facility is not for the faint of wallet: it demands an ultra-high-speed link from your data center, duplicate computing equipment and added staff.
Fortunately, three things have aligned in the IT universe to let small and medium-size businesses achieve the same level of data protection enjoyed by big organizations for a reasonable investment. They are disk imaging, virtualization and the one kind of cloud even hurricane survivors can appreciate: cloud services. And it’s just in time, as most organizations now have customer- and partner-facing applications that must be available around the clock.
Imaging is the key to fast recoveries
Tape-based disaster recovery is quickly becoming a thing of the past, spurred by off-the-cliff falls in hard disk drive pricing, the sheer number of servers that even small organizations must manage and the ascendancy over the last three to five years of highly reliable disk imaging-based disaster recovery solutions. Customer acceptance of disk-based solutions over tape is also an outgrowth of compliance requirements for privacy protection and data retention (both HIPAA and Sarbanes-Oxley come to mind), which pressure organizations to keep new and old data immediately available and online for years at a time.

Even at its very best, tape-based disaster recovery still takes hours, often days, to rebuild servers and retrieve archived data from tape. Multiply that by the burgeoning number of physical and virtual servers in a typical data center, from 10 to 20 in a small facility to thousands in a larger organization, and a full, swift, machine-by-machine recovery from tape becomes nightmarish to consider, if not impossible to accomplish. That’s where image-based disaster recovery is a relative miracle, offering recoveries in minutes or hours rather than hours or days, and even instant recoveries for mission-critical applications, all at a reasonable cost.
But for all their promise, not all disk imaging-based disaster recovery systems are created equal. For instance, it’s no longer necessary to take a machine offline, however briefly, to image it for backup, as some solutions require; instead you can take a snapshot of a live machine. This pays dividends in data currency for fast-moving, customer-facing applications like Microsoft Exchange and many SQL applications, which can’t lose even an hour or two’s worth of data without the loss hitting revenues and operations.
While imaging technology has fast become a keystone of a successful disaster recovery plan, the manner of its employment has been changing dramatically. To illustrate, take the actual case of a New Orleans-area petroleum lubricants company when Hurricane Katrina threatened power outages at the company’s data center in 2005. At the time, the fastest path to recovery was to manually take live images of about 20 servers, load them all on RAID drives and drive them more than 300 miles to another company office in Houston. Upon arrival they were individually restored onto newly purchased physical servers, storage systems, ancillary equipment and networking gear. Each server took just an hour to restore, an astonishing feat at the time, but the total time, including travel to the offsite location, was more than 36 hours.
Add in virtualization
Contrast this with a similar actual event three years later, when 2008’s Hurricane Gustav threatened a Louisiana fluids instrument manufacturer. This time, instead of physically transporting the disk images to another location, the company contacted a co-location provider and sent the images from its physical servers over a virtual private network to the provider’s Houston data center. There, the servers were restored as virtual machines on powerful host processors, tested and ready to take over in under an hour. The task was completed only moments before the Louisiana data center went offline. Not only was the transition to the offsite location completed in just a few hours, the customer estimates that it cost eight to 10 times less than it would have three years earlier.
Virtualization is no longer just a way to save money on server resources. It can also cut the cost of recovery itself, giving your organization the means to achieve high-speed disaster recoveries and to take advantage of rapidly evolving cloud services. It lets you create standby disaster recovery resources that cost little to maintain and can be launched at a moment’s notice. You might begin using cloud services for disaster recovery as a low-cost repository for your files and for images of your mission-critical servers. But you can quickly expand that use, especially for mission-critical applications, by replicating whole systems in virtual form in the cloud, where they can be launched for instant recoveries if disaster strikes.
Cloud services open the door for many more data centers to create a functional duplicate of a costly dedicated disaster recovery site for a fraction of the cost. And the cloud is gaining acceptance fast. According to CloudBzz, which tracks cloud computing worldwide, the market for cloud services will reach $10.5 billion in 2010, billowing to $34.1 billion in 2014. [http://www.cloudbzz.com, June 22, 2010].
How do you take advantage of this brave new world of less costly, yet ironclad disaster recoveries in the cloud? Start by planning. Inventory your current servers to create the baseline from which you can plan accurately for the future. Identify projects you expect to implement in the next six to 12 months, including any plans for virtualization. Ask yourself: will any of these changes require you to modify your current disaster recovery plans? Once you’ve answered that, look forward three years and pencil in the changes you expect to see. Include the likelihood that a growing percentage of servers will move up to a higher tier of importance as more applications become customer- and partner-facing; such a change usually translates into faster recovery time objective (RTO) requirements. Armed with this information, you have a context for evaluating new technologies and functional improvements in backup and recovery products and cloud-based services.
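An inventory like this can live in a spreadsheet, but even a small script keeps the baseline queryable as it grows. Below is a minimal sketch of such an inventory; every server name, role and RTO value is a hypothetical example, not a recommendation.

```python
# Minimal sketch of a server inventory used as a DR planning baseline.
# All server names, roles, and RTO values are hypothetical examples.
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    role: str
    customer_facing: bool   # drives tier and RTO decisions
    virtualized: bool       # candidates for cloud-based standby
    rto_minutes: int        # recovery time objective

inventory = [
    Server("exch01", "Exchange mail", customer_facing=True, virtualized=False, rto_minutes=15),
    Server("sql01", "order database", customer_facing=True, virtualized=True, rto_minutes=15),
    Server("file01", "file shares", customer_facing=False, virtualized=True, rto_minutes=240),
]

# The customer-facing, fast-RTO machines are the ones whose plans
# need revisiting first as tiers shift over the next three years.
fast_tier = [s.name for s in inventory if s.rto_minutes <= 15]
print(fast_tier)  # ['exch01', 'sql01']
```

Re-running a query like the last line after each planning cycle shows at a glance how many machines have migrated onto the fastest, costliest tier.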
Formalize your recovery time objectives and recovery point objectives
This exercise will ultimately save far more time than it takes. It will clarify your path to recovery and make it much easier to survive both minor and major disaster recovery situations. Begin by identifying the servers that must be recovered first, then next, and next. Once that’s settled, you can create recovery plans for each of the three or four tiers of servers in your organization. For instance, your most mission-critical servers may be assigned RTOs of 15 minutes or less, and in some cases may be so important that you’ll configure them for instant recovery. Lower-priority servers can be served by less resource-intensive, in other words less costly, recovery mechanisms. Ordering them this way lets you fine-tune your recovery needs against budget priorities.
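The tiering step above can be sketched in a few lines of code. The tier boundaries and server list here are hypothetical examples; the point is that sorting by RTO produces the recovery order for free.

```python
# Minimal sketch of grouping servers into recovery tiers by RTO.
# Tier boundaries and server entries are hypothetical examples.
from collections import defaultdict

# (server, rto_minutes) pairs; a lower RTO means recover it sooner
servers = [("exch01", 15), ("sql01", 15), ("web01", 60),
           ("file01", 240), ("archive01", 1440)]

def tier_for(rto_minutes):
    """Map an RTO to one of four recovery tiers (example boundaries)."""
    if rto_minutes <= 15:
        return "tier1-instant"
    if rto_minutes <= 60:
        return "tier2-fast"
    if rto_minutes <= 480:
        return "tier3-same-day"
    return "tier4-deferred"

# Sorting by RTO yields the recovery order; grouping yields the tiers,
# each of which can get its own (appropriately priced) recovery mechanism.
tiers = defaultdict(list)
for name, rto in sorted(servers, key=lambda s: s[1]):
    tiers[tier_for(rto)].append(name)

for tier, names in tiers.items():
    print(tier, names)
```

With tiers in hand, budgeting becomes a per-tier decision: instant recovery only for tier 1, cheaper mechanisms below.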
Move to a more affordable, more responsive disaster recovery solution
Having this planning data in hand is the catalyst for a comprehensive, more affordable DR solution that will be driven in large part by the ability to virtualize servers. Because virtualization has become so prevalent, and its portability has been demonstrated in day-to-day business operations, it makes sense to harness this portability for cloud-based disaster recovery solutions to save money and increase the speed of your recoveries.
In a nutshell, machines can be imaged, transmitted over a virtual private network through WAN accelerators and restored as virtual machines offsite. This can be accomplished for far less than was possible even a year ago, because cloud services vendors are just beginning to replicate mission-critical servers as virtual machines, and they keep costs low by maintaining them on a low-energy, standby basis rather than powered up. Yet because the standby machines can be launched immediately, they’re available as quickly as dedicated, replicated machines. Downtime is reduced or eliminated altogether, even in the face of a natural disaster.
Test now and be confident later
However you manage an offsite facility, there’s no substitute for testing your recovery plan and making sure it works the way you envisioned when you developed it. Testing your ability to recover smoothly to another location is part of any IT organization’s best practices. It’s relatively easy to back something up, get a replicated server sited offsite on a virtual server host, and store it there on a standby basis. But are you certain it’s recoverable? Think of how much you have invested in the idea that your Exchange server, your domain controller or your key SQL application will be recoverable in a disaster. Then test it to make sure you can really get it back.
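Part of such a test can be automated: after a trial failover, verify that each restored server actually answers on its service port. The sketch below shows one way to do that with plain TCP checks; the hostnames and ports are hypothetical examples, and a real test would go further (logging in, running application queries).

```python
# Minimal sketch of an automated recovery-verification pass: after a
# test failover, confirm each restored server answers on its service port.
# Hostnames and ports below are hypothetical examples.
import socket

CHECKS = [
    ("dr-exch01.example.com", 443),   # Exchange (Outlook Web Access)
    ("dr-dc01.example.com", 389),     # domain controller (LDAP)
    ("dr-sql01.example.com", 1433),   # key SQL application
]

def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def run_checks(checks):
    """Return the subset of (host, port) checks that failed."""
    return [(h, p) for h, p in checks if not port_open(h, p)]
```

Running `run_checks(CHECKS)` after every scheduled failover drill turns "I think it's recoverable" into a short, repeatable pass/fail report.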
You’ll thank yourself later for these disaster-defying results:
- You’ll reveal the inevitable holes in your seemingly flawless DR plan in all their glory. But you’ll have caught them before they can blow holes in a real recovery effort, and you’ll have plenty of time to plug them.
- By testing, you’ll have defined and verified a recovery time objective in the most practical way. If you find it takes longer than you anticipated, you can make adjustments to fix it. Moreover, you can confidently forecast your return to productivity to your boss, your co-workers, your customers and your partners. That’s real peace of mind, and possibly a job-saver.
- You’ll be calmer. When you actually come face to face with a disaster, with lots of applications offline or headed that way, you won’t stress out (as much)! Practicing for high-stress situations works for police officers and emergency medical technicians, and it will work for you. It helps prevent drawing a collective blank when disaster strikes. The calmer you are, the more quickly and surely you’ll recover.
The author wishes to acknowledge technical insights on cloud technology offered by iland Internet Solutions.
Ismail Azeri is the vice president of business and corporate development at Acronis. Azeri joined Acronis from VMware in July 2009, where he led the corporate business development team responsible for acquisitions, investments and strategic partnerships. Before VMware, Azeri held several roles in corporate development and finance at EMC. At Acronis, Azeri is responsible for corporate partnerships, acquisitions and the company’s OEM business, and plays an integral role in driving corporate strategy. Azeri holds a bachelor’s degree in finance from Bentley College, where he graduated with honors in three years.