According to a recent IDC survey, data center managers expect to allocate nearly 50 percent of their budgets to running services in the cloud (public and private) by 2013. As this cloud revolution continues, it’s also becoming increasingly clear that more and more applications will be delivered "as a service."
But can everything be delivered as a service? SaaS, IaaS and PaaS are the veterans of the as-a-service team, and the diversity of cloud offerings is only accelerating. Some of the new recruits are metal as a service (MaaS), mobile backend as a service (BaaS) and, lately, disaster recovery as a service (DRaaS), which we’ve been hearing a lot about. DRaaS is particularly interesting because it helps IT address many of its biggest challenges. As a result, DRaaS is a natural fit for cloud computing and is rapidly becoming the killer app for the cloud, with service providers, IT resellers and start-ups all jumping on board.
The reason for the hype is that every business wants to reduce the impact of downtime when a disaster happens. IDC finds that in today’s business environment, the impact of outages is greater than ever. Over 60 percent of executives stated that downtime has a significant impact on their business performance.
Industry experts trace the cause of downtime to a variety of factors: simple human error, natural events, manmade events or failures in technology. Regardless of the reason, the result is the same: the applications and services you need are not available when you need them. And that hurts. The cost of downtime ranges widely. On the high end, lost revenue for a brokerage is estimated at $6 million to $7 million per hour, while an outage in retail services is estimated to cost between $70,000 and $100,000 per hour. In either case the point is clear: even a few hours of downtime in business-critical applications has a meaningful impact on a company. Downtime hurts revenue growth and profitability, and it also costs productivity.
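To make the arithmetic concrete, here is a minimal sketch that scales the hourly figures quoted above to an outage of a given length. The function name and the three-hour outage are illustrative assumptions, not part of any cited study, and the calculation covers lost revenue only.

```python
def downtime_cost_range(hours, low_per_hour, high_per_hour):
    """Return (low, high) estimated lost revenue for an outage lasting `hours`."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * low_per_hour, hours * high_per_hour


# Hourly loss ranges quoted in the text, in US dollars.
HOURLY_LOSS = {
    "brokerage": (6_000_000, 7_000_000),
    "retail": (70_000, 100_000),
}

# Assumed outage length, chosen only for illustration.
OUTAGE_HOURS = 3

for sector, (low, high) in HOURLY_LOSS.items():
    lo, hi = downtime_cost_range(OUTAGE_HOURS, low, high)
    print(f"{sector}: ${lo:,} to ${hi:,} for a {OUTAGE_HOURS}-hour outage")
```

Under these assumptions a three-hour outage costs a retailer $210,000 to $300,000 and a brokerage $18 million to $21 million in lost revenue, before counting lost productivity.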
When a disaster strikes, it is clear that there are serious repercussions to the business. In addition, there could be temporary or even permanent loss of critical data. Offsite, comprehensive and regular data back-ups are usually the first step in an IT continuity plan, but in today’s 24x7 business world, protecting against data loss is not good enough.
In addition to data loss, disasters may also lead to the permanent loss of physical infrastructure, including IT infrastructure. This loss may result in the inability to fulfill existing or new orders. In most cases the loss of key IT applications or services can have as negative an impact on the business as the loss of data. In fact, according to the National Archives & Records Administration in Washington, 93 percent of companies that lost their data center for 10 days or more due to a disaster filed for bankruptcy within one year of the disaster. To be fully protected, organizations must have a plan to quickly restore not only their data but also the underlying server capacity and the business services those servers support.
At the same time, organizations are looking for a more cost-effective way to implement an easy-to-use disaster recovery service. A Forrester study found that business continuity was a top five priority for the office of the CIO in 2012. While DR may be a top priority, it can be capital intensive. With a traditional approach, companies require not only a secondary data center but also a complete replica of the IT environment. Costs can run into the millions, if not tens of millions, and for many companies those expenses are simply too high. With that in mind, and given the impact data loss and downtime have on organizations, DRaaS becomes a viable, if not essential, option.
Benefits of DRaaS: Why Disaster Recovery in the Cloud?
With traditional DR, the DR site has to be in lock step with the primary site. This means the server on the DR site has exactly the same configuration, BIOS, drivers, etc., as the physical server you are trying to recover at the production site. This is difficult to do unless you buy both servers at exactly the same time. And even if you do, configuration drift is natural, because every change at the primary site must be mirrored at the DR site. Failure to keep this perfect lock step greatly increases the chance of an outage. With DRaaS, the differences between the primary and secondary site can be abstracted (via technologies like converged infrastructure), so customers can be freed from the synchronization handcuffs.
Disaster recovery has always been important to businesses. But as stated above, businesses have needed to spend a large amount of capital to own the infrastructure that supports DR services. Once they are freed from this management nightmare, there are fewer and fewer reasons for companies to own their DR infrastructure. This opens the door to the cloud, and with DRaaS all the upfront investment is removed. Customers can access computing only when they need it and pay on a monthly basis for what they use; typically this is just a fraction of the cost of a dedicated DR site.
Simpler to Test
The costs of traditional recovery testing and exercising often constitute a significant portion of the annual disaster recovery budget (Gartner estimates $100,000 or more per exercise). DR in the cloud frees up tremendous amounts of time, mainly because DRaaS does not have the same synchronization requirements as traditional DR. As a result, customers can test their plans more frequently, going from annual tests to quarterly, or even from quarterly to monthly. This results in better predictability and a greater likelihood of a fast and successful recovery, always the most important metric.
Minimizing the amount of downtime is now seen as a competitive advantage. Businesses strive to have the least amount of downtime, and as we’ve seen recently, public outages have a detrimental effect on a company’s reputation. If a disaster does take down the infrastructure, fast recovery is critical and can dictate the difference between long-term success and failure. Recovery from back-up tape, which can literally take days, isn’t cutting it anymore. With DRaaS, companies can have access to services that are less expensive and still deliver faster recovery times, usually a matter of hours or minutes.
Old DR models are manual and require keeping a physical run book that describes every outage and recovery. Recovery was done at human speed, which was counterproductive because lost time meant lost revenue. An automated process speeds recovery and maintains accuracy. Many of these positive factors are why companies take advantage of the cloud and virtualization.
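As a rough illustration of what automating a run book can mean, the sketch below runs recovery steps in order, times each one and stops at the first failure. The step names and their placeholder actions are hypothetical; a real run book would call a provider's own tooling, which is not shown here.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # returns True on success


def run_runbook(steps: List[Step]) -> Tuple[bool, List[Tuple[str, bool, float]]]:
    """Run steps in order, stop at the first failure, and return a timed log."""
    log = []
    for step in steps:
        started = time.monotonic()
        ok = step.action()
        log.append((step.name, ok, time.monotonic() - started))
        if not ok:
            return False, log
    return True, log


# Hypothetical placeholder steps that always succeed.
steps = [
    Step("provision standby server", lambda: True),
    Step("restore latest backup", lambda: True),
    Step("redirect traffic", lambda: True),
]

success, log = run_runbook(steps)
for name, ok, seconds in log:
    print(f"{name}: {'ok' if ok else 'FAILED'} ({seconds:.3f}s)")
```

Because every run leaves a timed log, the same script can serve both for recovery exercises and for a real recovery, which is one way to make recovery verifiable.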
Challenges and Best Practices
As compelling as DRaaS is, there are always some important factors to keep in mind and challenges to be aware of. Today’s DR best practices must account for the growing complexity of IT, assume control over a diversity of platforms and, most importantly, provide guaranteed, fast and verifiable recovery through digital automation of the recovery run book.
Today’s businesses need to determine where DRaaS can fit into the overall recovery portfolio. This is because many companies still have legacy applications that may not work in today’s cloud architectures, as these clouds are designed to host applications as virtual machines. While that works for applications that have already been virtualized, the vast majority of mission critical applications that need protection are running on bare metal hardware. While this can be a showstopper, the mismatch in requirements is driving cloud providers to support physical, bare metal applications in DR services.
A New Service
DR in the cloud has only surfaced recently, with conversations focused on best practices. With new solutions from providers coming out over the past year, it’s even more important to ask about and double-check service level agreements, guaranteed capacity and the amount of accessible compute capacity.
Security is also a risk in the cloud, and many companies struggle to trust cloud services and feel confident in them. It is important to discuss security issues and what steps will be taken to ensure all applications, especially mission critical applications, are safe, secure and protected.
Another important aspect to consider is the location of a provider. As a rule of thumb, you do not want a provider to be located in your backyard. Even though most outages aren’t caused by a disaster, it is still best practice to keep the recovery site out of the same region as your primary site.
Even with new DR services in the cloud, businesses still need to practice the basics of DR to ensure the critical baseline of continuity in the data center. They should still test and run recovery exercises, properly train employees and work closely with vendors. No matter what approach businesses take, these basics will help make the transition to DR in the cloud as smooth as possible.
It is also important to tier applications with DRaaS. Distinguishing mission-critical applications from non-critical ones can help keep costs down and ensure the most valued applications never go down.
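One simple way to picture tiering is as an ordering problem: mission-critical applications are recovered first, everything else afterward. The sketch below assumes a two-tier split, and the application names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    mission_critical: bool


def recovery_order(apps):
    """Return apps with mission-critical ones first.

    sorted() is stable, so apps within the same tier keep their original order.
    """
    return sorted(apps, key=lambda app: not app.mission_critical)


# Hypothetical application inventory.
apps = [
    App("internal wiki", False),
    App("order processing", True),
    App("test environment", False),
    App("customer database", True),
]

for app in recovery_order(apps):
    tier = "mission-critical" if app.mission_critical else "non-critical"
    print(f"{app.name}: {tier}")
```

In practice an organization might use more than two tiers and attach a different recovery time target to each, but the principle of ordering recovery by tier stays the same.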
Cloud computing is driving organizations to completely rethink their IT investments and strategies. As the technology of cloud evolves and diversifies, concepts like DRaaS are causing a resurgence in business continuity and disaster recovery planning. By 2014, Gartner predicts that 30 percent of midsize companies will have adopted recovery-in-the-cloud to support IT operations. While the stage is certainly set for explosive growth in DRaaS, the industry has a long way to go to provide more choices and customization for organizations looking to use cloud to improve uptime and IT service levels.
One thing is clear: the time is right for both industry and end users to focus on and explore all the possibilities the cloud can offer.
Pete Manca is the president and CEO of Egenera.