Wednesday, 11 February 2015 06:00

Out with the Old and in with the New

Written by Bill Andrews
How backup changes from physical servers to virtualized environments


Backup is often at the bottom of the list for IT teams, even an afterthought, but an effective back-up and disaster recovery plan requires treating backup as an important piece of the IT strategy. Over the past two decades, IT departments have hosted a dedicated physical server per application, but as companies move to virtual environments, back-up approaches must also evolve.

For most IT departments, the rotation for physical servers is as follows: Monday through Thursday, run a full backup of email and databases and write to disk only the changed, unstructured data files (incrementals); then on Friday, run a full backup of all data in the environment.

In some cases, IT departments perform changed-block backups, but the vast majority run full-file incrementals during the week with a full backup on Friday. In the physical server world, you only back up the data. In some cases, you may do a complete image-based backup for bare metal restores, but the vast majority of backups cover only data.
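The weekly rotation described above can be sketched as a small scheduling function. This is only an illustration: the job labels and the `jobs_for` helper are hypothetical names, not taken from any back-up product, and the sketch assumes nothing runs on weekends because the rotation does not mention them.

```python
from datetime import date
from typing import List

# Hypothetical job labels; the rotation names the data classes, not product job types.
FULL_EMAIL_DB = "full: email and databases"
INCR_FILES = "incremental: changed unstructured files"
FULL_ALL = "full: all data"


def jobs_for(day: date) -> List[str]:
    """Return the back-up jobs the physical-server rotation would run on `day`."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday <= 3:  # Monday through Thursday
        return [FULL_EMAIL_DB, INCR_FILES]
    if weekday == 4:  # Friday
        return [FULL_ALL]
    return []  # assumption: no jobs on Saturday or Sunday


wednesday_jobs = jobs_for(date(2015, 2, 11))
friday_jobs = jobs_for(date(2015, 2, 13))
```

Note that every job in this rotation captures data only; nothing in it captures the operating system or application.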

For more than 20 years, IT departments have only had to worry about one thing when backing up their systems: data. As we move into an increasingly VM-dominated data center, there are many factors that need to be considered.


The move to a virtual environment and what it means:
IT data centers are rapidly moving to a virtualized server world. While virtual servers still use direct attached storage, network attached storage, or storage area network storage, the back-up application no longer backs up just the data. It now backs up the entire virtual machine (VM).

This includes the guest operating system, the application, all associated system files, and the data. When you’re backing up more than just data and preparing it for a recovery point at a later time in the event of a disaster, the way you think about backup also needs to change.

In the past, if a primary server failed for any reason, you would obtain a new server, load the server software or copy an image onto the server, and then restore the most recent data backups to ensure that the most up-to-date data is on the system. In the virtual world, VM backups are typically written to disk. Since the VM is sitting on back-up disk, you can simply boot it from the back-up system.

If the primary systems fail, you can boot a VM off the back-up system, and users can continue to work directly from this. Once the hypervisor is used to make the primary systems operational, the user activity is transparently moved back to the primary systems through a VM instant recovery, allowing your recovery time to be minutes versus hours.

If a third-party auditor is auditing your business continuity, you can boot a VM off the back-up system to demonstrate that you have a working copy of the entire VM, including data. In the past, it was almost impossible to show an auditor that you could recover from a failure. In the VM world, you can simply boot the VM off the back-up disk and show the auditor that it is running. This capability is sometimes called a “verified” or “sure” backup.

Since VMs are sitting in the back-up system, you can also boot the VM, apply a patch, and perform tests in a “virtual lab” before rolling it out in the primary virtualized environment to minimize risk.


The critical role VMs play in a DR-plagued landscape:

Backing up VMs changes how you recover, pass internal audits, and test outside of the production environment. Changing from backing up just the data to backing up the entire VM also changes your back-up infrastructure and process. With the need to boot VMs for instant recoveries, boot VMs for auditors, boot VMs to test software changes and updates outside of production, and perform weekly synthetic fulls, disk is required in the virtualized back-up world, since you cannot boot from tape and cannot easily read/write with tape.

For disaster recovery, instead of building servers, loading the system and application software, and then recovering the data, the only thing required is to boot the VM. The VM has everything needed, including data, to make the replacement systems production-ready.

In virtualized environments, the hypervisor tracks changed storage blocks via changed block tracking (CBT). The back-up application picks up the changed blocks and copies them to the back-up storage target. Unlike traditional physical server backup, where a full backup is performed every Friday night, in the virtualized world each backup contains the changed blocks only. Backing up only changed blocks shortens the back-up window, but keeping only changed-block backups carries risk. If you retain too many CBT backups, restores become painfully slow because every backup in the chain has to be read and applied. In addition, if a block is damaged or corrupted anywhere in the chain, reconstituting a full backup will fail. To overcome this, IT must create a full backup, sometimes called a “synthetic full” backup, at least once a week.
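To make the chain risk concrete, here is a minimal sketch of how a synthetic full could be assembled from one full backup plus a chain of changed-block backups. It is a toy model under stated assumptions, not how any particular hypervisor or back-up application does it: the `Block`, `make_block`, and `synthesize_full` names are hypothetical, backups are plain dictionaries, and corruption is detected with a SHA-256 checksum recorded at back-up time.

```python
import hashlib
from typing import Dict, List, NamedTuple


class Block(NamedTuple):
    data: bytes
    checksum: str  # SHA-256 of `data`, recorded when the block was backed up


def make_block(data: bytes) -> Block:
    return Block(data, hashlib.sha256(data).hexdigest())


# A backup maps block number -> Block. A full holds every block;
# a changed-block backup holds only blocks changed since the previous backup.
Backup = Dict[int, Block]


def synthesize_full(base_full: Backup, incrementals: List[Backup]) -> Backup:
    """Merge a full backup with its changed-block chain, oldest first."""
    merged: Backup = {}
    for backup in [base_full, *incrementals]:
        for number, block in backup.items():
            # Strict check: one bad block anywhere in the chain fails the merge.
            if hashlib.sha256(block.data).hexdigest() != block.checksum:
                raise ValueError(f"block {number} is corrupt; chain cannot be rebuilt")
            merged[number] = block  # a newer version overwrites the older one
    return merged


monday_full = {0: make_block(b"os"), 1: make_block(b"app"), 2: make_block(b"data-v1")}
tuesday = {2: make_block(b"data-v2")}
wednesday = {2: make_block(b"data-v3"), 3: make_block(b"log")}

synthetic = synthesize_full(monday_full, [tuesday, wednesday])
```

The loop visits every backup in the chain, which is why long chains restore slowly, and the checksum test shows why a single damaged block stops the rebuild. Creating a synthetic full each week resets the chain to a known-good starting point.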

With CBT, most VM back-up applications with a weekly synthetic full will see a storage reduction of 2:1 to as much as 6:1. As the retention period grows, so does the disk storage. With retention periods of four to six copies or greater, the amount of disk storage required becomes quite costly very quickly. Therefore, for four to six copies of retention or less, straight disk can be used. With a larger number of copies, disk-based back-up appliances with data deduplication are required. Data deduplication appliances can raise the rate of deduplication in a virtualized environment to as much as 20:1. As a result, far less disk is used, and the cost to store a larger number of copies for retention is far less using dedicated appliances with data deduplication versus using straight disk.
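A rough back-of-the-envelope calculation shows why the ratios matter as retention grows. The sketch below assumes the reduction ratio applies uniformly to all retained data, which is a simplification; the 10 TB environment and 12 retained copies are made-up figures for illustration, and `disk_needed_tb` is a hypothetical helper.

```python
def disk_needed_tb(full_size_tb: float, copies: int, reduction_ratio: float) -> float:
    """Disk required to retain `copies` full-size backups at a given N:1 reduction ratio."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return full_size_tb * copies / reduction_ratio


# Hypothetical environment: 10 TB of VMs, 12 retained copies (120 TB of logical data).
low_cbt = disk_needed_tb(10, 12, 2)    # 2:1, low end for CBT with weekly synthetic fulls
high_cbt = disk_needed_tb(10, 12, 6)   # 6:1, high end for CBT with weekly synthetic fulls
dedup = disk_needed_tb(10, 12, 20)     # 20:1, upper figure quoted for deduplication appliances
```

Under these assumptions the same 12 copies need 60 TB at 2:1, 20 TB at 6:1, and 6 TB at 20:1, so the gap between straight disk and a deduplication appliance widens with every additional copy retained.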

A disk appliance with data deduplication needs a high-speed disk cache that holds full, undeduplicated VMs, so that you can boot them in the scenarios above and easily perform a synthetic full backup. There are two types of disk-based back-up appliances.

  1. The first deduplicates the data inline, which means deduplication occurs on the way to disk and the appliance stores only deduplicated data. To be able to boot a VM, you need to put a straight disk storage cache in front of the appliance to hold full VMs ready to boot, and keep the longer-term deduplicated storage in the appliance.

  2. The second type builds both into a single integrated appliance. These appliances have a disk cache, or “landing zone,” that keeps the most recent VMs in their full form, ready to be booted or restored, and store all the deduplicated data behind it.
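The second design can be pictured with a toy model: the latest full image of each VM sits in a landing zone, while every version is also broken into chunks and stored once behind it. This is a sketch under loud assumptions, not a description of any vendor's implementation: the `LandingZoneAppliance` class is hypothetical, chunks are fixed-size, and the 4-byte chunk size is unrealistically small so the example is easy to follow.

```python
import hashlib
from typing import Dict, List, Tuple

CHUNK_SIZE = 4  # bytes; assumption made purely for readability


class LandingZoneAppliance:
    """Toy model: recent full images up front, deduplicated versions behind."""

    def __init__(self) -> None:
        self.landing_zone: Dict[str, bytes] = {}  # VM name -> latest full image
        self.chunks: Dict[str, bytes] = {}        # chunk hash -> chunk, stored once
        self.recipes: Dict[Tuple[str, int], List[str]] = {}  # (VM, version) -> chunk hashes
        self.versions: Dict[str, int] = {}        # VM name -> latest version number

    def ingest(self, vm: str, image: bytes) -> int:
        version = self.versions.get(vm, 0) + 1
        self.versions[vm] = version
        self.landing_zone[vm] = image  # full copy, ready to boot or restore
        recipe = []
        for i in range(0, len(image), CHUNK_SIZE):
            chunk = image[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # identical chunks are kept only once
            recipe.append(digest)
        self.recipes[(vm, version)] = recipe
        return version

    def latest(self, vm: str) -> bytes:
        """Most recent image, served from the landing zone with no reassembly."""
        return self.landing_zone[vm]

    def rehydrate(self, vm: str, version: int) -> bytes:
        """Rebuild an older version from its deduplicated chunks."""
        return b"".join(self.chunks[d] for d in self.recipes[(vm, version)])


appliance = LandingZoneAppliance()
v1 = appliance.ingest("mail", b"AAAABBBBCCCC")
v2 = appliance.ingest("mail", b"AAAABBBBDDDD")
```

After the two ingests, the two versions share their first two chunks, so only four unique chunks are stored for six chunks of logical data. The latest image is available immediately from the landing zone, while the older version has to be reassembled from the chunk store.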

Over the years, backup has changed from backing up data to backing up complete virtual machines, and disaster recovery requirements now shape the data center's back-up infrastructure. All in all, the move from physical servers to virtualized environments is changing how you prepare, set up, and store backups.


Bill Andrews is the CEO of ExaGrid Systems. Andrews has written three books on backup:

“Straight Talk About Disk-based Backup with Data Deduplication”

“Straight Talk About The Cloud for Data Backup and Disaster Recovery”

“Straight Talk About Primary Storage Snapshots & Traditional Backups”