The Key to ILM is Cost-Effective Backup

Written by Eran Farajun, Monday, 19 November 2007 20:27

Information lifecycle management (ILM) is a relatively young concept and the industry will continue to encounter pitfalls as ILM solutions develop. As it stands at the moment, ILM has two aims: to reduce administration costs and to make the most efficient use of storage hardware. But in order to achieve these, ILM needs to rely on an ILM-aware backup system. A backup architecture lacking the ability to maximize the use of ILM will reduce any chance of successfully realizing the goals of cost reduction and greater storage efficiency.

Why The Need For ILM

Reliable and secure data storage is crucial to business continuity plans. As enterprises depend ever more heavily on information about their technical processes, data storage is becoming something of a headache for IT executives and storage administrators. Many industries, such as finance and healthcare, face evolving regulations on data retention with which enterprises must comply. These requirements, together with the growing volume of data that enterprises store, mean that the cost of managing information can grow by as much as 20 to 30 percent per year. With estimates such as these, cost-effective data storage and management becomes of paramount importance to enterprises and IT managers.

What Is ILM

Definitions of ILM vary, but here ILM is defined as a data archiving process that automatically moves data to the most cost-effective storage media available, based on prescribed policies of accessibility, security, and long-term storage. This automatic transfer of data requires no manual intervention and reduces hardware and real estate costs, which is why ILM vendors are able to promise a significant return on investment (ROI).

The data generated by an enterprise can be placed into two categories:

Critical information is the data that is used for day-to-day operations and is located within the enterprise’s primary storage system, allowing for fast access.

Important information is the data that can be archived to secondary storage, typically lower cost disks or tapes at an off-site location. This information is historical, legal, and regulatory.

Critical data is accessed frequently, yet over time a file will be accessed more sporadically, and its status changes from critical to important. A prescribed policy can also set a length of time after which a file ceases to be critical, such as 90 days. The ILM solution then automatically archives this data to secondary storage, without manual assistance from IT personnel, and creates a “pointer” that contains the metadata for every file that has been moved. If the file ever returns to critical status, the pointer directs the user straight to the file’s new location so that it can be retrieved for use.
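The policy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular ILM product works: the 90-day window comes from the example in the text, while the `StoredFile` and `Pointer` records, the `archive_stale_files` and `retrieve` functions, and the dictionaries standing in for primary and secondary storage are all hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: a file not accessed for 90 days stops being "critical".
CRITICAL_WINDOW = timedelta(days=90)


@dataclass
class StoredFile:
    name: str
    content: bytes
    last_accessed: datetime


@dataclass
class Pointer:
    """Hypothetical stub left in primary storage after archiving."""
    name: str
    archive_location: str
    archived_at: datetime


def archive_stale_files(primary, secondary, now):
    """Move files older than the policy window to secondary storage.

    `primary` maps file names to StoredFile or Pointer objects;
    `secondary` maps archive locations to StoredFile objects.
    Returns the list of pointers created on this pass.
    """
    created = []
    for name, entry in list(primary.items()):
        if isinstance(entry, Pointer):
            continue  # already archived on an earlier pass
        if now - entry.last_accessed > CRITICAL_WINDOW:
            location = f"archive/{name}"
            secondary[location] = entry
            pointer = Pointer(name, location, now)
            primary[name] = pointer  # the pointer replaces the file
            created.append(pointer)
    return created


def retrieve(primary, secondary, name, now):
    """Return a file's content, following its pointer if it was archived."""
    entry = primary[name]
    if isinstance(entry, Pointer):
        entry = secondary[entry.archive_location]
    entry.last_accessed = now
    return entry.content
```

In this sketch the user always asks primary storage for the file by name; whether the content comes from primary or secondary storage is decided by the pointer, which mirrors the transparent retrieval the article describes.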

The efficacy of ILM can be compared with the systems libraries have used to manage the thousands of books in their collections. It is fairly easy and cheap to buy books, yet expensive to manage their storage so that you know where each book is at any point in time. Additionally, one system needs to be set up to manage the movement of these books manually, and another to categorize them. As new books are added to the collection (i.e. critical data) they need to be categorized and stored correctly. As books decrease in demand, they are filed away to an archive (i.e. important data). An ILM system would automatically categorize and store the new books accordingly, as well as re-shelve the low-demand books elsewhere, thereby removing the need for such time-consuming management.

Where Does A Problem Arise With Backup?

Enterprises are recognizing, through the media “hype” surrounding ILM, that it is something worth investing in, and are quite rightly looking to this new concept to improve the efficiency of their data storage management. But in doing so, enterprises can forget to take their existing back-up system into account and fail to ensure that stored data isn’t duplicated.

The typical back-up architecture saves files from primary (critical status) storage to a low-cost disk or tape on a daily basis. As long as a given file remains critical, this frequent backing up continues.

ILM archiving is distinct from back-up operations: archiving moves data that is no longer critical into long-term storage, whereas backup protects critical data before it can be archived.

Back-up systems that are not ILM-aware will continue to store backed up files on tape or secondary disks regardless of the data already archived elsewhere. This is an important oversight as both sets of data must now be managed, incurring an increase in costs and reduction in efficiency. The result is a lower return on ILM investment than the IT directors would have expected.

Returning to the library analogy, this duplication problem could occur if a library decided to ensure that its bestsellers were always available for borrowing and made a copy of a bestseller each time it was loaned out. The benefit is that the book is always available. However, once the book is no longer a bestseller (no longer critical data) and all the copies have been returned, the librarian would have to ensure there is space on the archive shelves for all these duplicates. Although an important process has been put in place, it has proved costly to the library. In the same way, a back-up system that is not ILM-aware can waste valuable storage space.

How To Counteract The Problem

A realistic and efficient solution to this major failing of backup is to implement an ILM-aware backup, such as distributed backup. Distributed backup removes entirely the need for daily backups of critical data onto costly tapes, thereby automatically reducing the level of storage management required by an enterprise.

A distributed back-up system collects the data from the network clients and sends it to offsite disk storage in a compressed and encrypted format. When the data is needed for a restore, the system retrieves it as required. The process is fully automated and ensures fast, multiple backups without duplication. The back-up process is efficient and the user can be assured of achieving the anticipated ROI.

This ILM-aware distributed backup makes efficient use of ILM’s archive pointers by retaining only one copy of a file, on either back-up or secondary storage. The pointers enable the backup to determine which files have been archived and allow it to automatically remove these excess files from the back-up disks. This improves cost-efficiency by removing the problem of file duplication and uneconomical use of storage space.

ILM-aware distributed backup does this by locating and recognizing a given file’s pointer in the back-up data (received from the client), then automatically searching the back-up disk for the original file, deleting it, and saving the pointer in its place.
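As a rough sketch of that step, suppose a client's back-up set arrives as a mapping from file names to either full file contents or archive pointers. The `ArchivePointer` class and the `apply_backup_set` function below are hypothetical names invented for illustration, and a plain dictionary stands in for the back-up disk; a real product would work against its own storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivePointer:
    """Hypothetical pointer an ILM system leaves behind for an archived file."""
    archive_location: str


def apply_backup_set(backup_disk, backup_set):
    """Merge one client's back-up set into the back-up disk.

    Both arguments map file names to either raw bytes (a full copy of
    the file) or an ArchivePointer. Returns the names of the full
    copies that were dropped because the file is now archived.
    """
    removed = []
    for name, item in backup_set.items():
        if isinstance(item, ArchivePointer):
            # The file has moved to secondary storage: drop the full
            # copy held on the back-up disk, if there is one.
            if isinstance(backup_disk.get(name), bytes):
                removed.append(name)
            backup_disk[name] = item  # keep only the small pointer
        else:
            backup_disk[name] = item  # still critical: back up as usual
    return removed
```

After a pass like this, each file exists in full in only one place: on the back-up disk while it is critical, or in secondary storage once it has been archived.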

A librarian could use pointers in the same way in order to solve the problem of having to store multiple copies of a bestseller each time one is made. A stamp (a pointer), for example, on the original copy would automatically tell the librarian that any other returned copies of the same book are not this original. The librarian can then discard these excess versions each time they are returned to the library so that they don’t have to find storage space on the shelf for more than one copy. The library’s indexing system will automatically detect the stamp on the original book and ensure that it is shelved accordingly.

This system means that current data in primary storage is backed up to disk, minimizing disk size and cost. Distributed backup results in faster, more frequent backups and simpler restore operations, while reducing hardware and storage costs and the necessity for daily administration.

It is important to realize that a backup file has a life of its own, separate and distinct at every stage: from when it is created, to when it is kept on different tiers of storage media, to when it is deleted.


 

Eran Farajun is executive vice president for Asigra Inc., the multi-site backup/recovery specialist. His role at Asigra includes marketing and strategic business development. Farajun holds a law degree from the University of Sheffield in the UK.
