The timing is right for data de-duplication, especially in remote office environments and for virtual software deployments. First, data, whether located in data centers or at remote sites, is growing exponentially. Second, the cost of disk storage has decreased to the point where disk now costs less than tape when implemented with data de-duplication. And third, a very bright legal spotlight is now being cast on companies that transport tapes back and forth for purposes of disaster recovery.
Given this increased scrutiny, it is highly ironic – and seemingly illogical – that large, sophisticated enterprises that are so dependent on electronic data still gamble on trucking their tapes to offsite locations.
Not All De-Duplication is the Same Shape, Size and Color
Depending on your environment and current needs, there are a few ways of leveraging de-duplication within your information infrastructure. One way is to implement it globally, across all clients, at the source of the data (within the systems being protected). This distributes de-duplication broadly across the environment, which is what makes it powerful.
Global de-duplication can create an impressive 300-to-1 reduction in the amount of data that must be moved across the network from primary systems for data protection. By filtering out redundant data segments at the source through the use of intelligent, de-duplicating backup agents, customers can dramatically reduce the network bandwidth required across WANs and the backup storage required, lowering both storage and operational costs.
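To make the idea of filtering at the source concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any vendor's agent: the fixed 4 KB segment size, the use of SHA-256 as the segment fingerprint, and the names `BackupServer` and `source_side_backup` are all hypothetical. Real products may use variable-length segments and a different protocol.

```python
import hashlib

SEGMENT_SIZE = 4096  # hypothetical fixed segment size; real agents may use variable-length segments


def split_segments(data: bytes, size: int = SEGMENT_SIZE):
    """Cut a byte string into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BackupServer:
    """Stand-in for the backup store: keeps one copy of each segment, keyed by its hash."""

    def __init__(self):
        self.segments = {}

    def has(self, digest: str) -> bool:
        return digest in self.segments

    def put(self, digest: str, segment: bytes) -> None:
        self.segments[digest] = segment


def source_side_backup(data: bytes, server: BackupServer) -> int:
    """Back up `data`, sending only the segments the server does not already hold.

    Returns the number of payload bytes that had to cross the network.
    """
    sent = 0
    for segment in split_segments(data):
        digest = hashlib.sha256(segment).hexdigest()
        if not server.has(digest):  # for known segments, only the small hash is exchanged
            server.put(digest, segment)
            sent += len(segment)
    return sent
```

In this sketch, backing up eleven segments of which ten are identical sends only two segments (8,192 bytes) on the first run. A second client backing up the same data to the same server sends no segment data at all, which is the "global, across all clients" effect described above.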
This approach is ideal for remote and branch offices, because it allows companies to send much smaller amounts of data and use much less bandwidth. It’s also ideal for virtual software deployments, where consolidation of multiple virtual servers onto a single physical server can challenge traditional backup solutions.
Data de-duplication can also be implemented at the target, or device, level. In this approach, it is performed at the back end, after the full and incremental backups have traveled across the network. This offers some benefits, but it does not take full advantage of a more globally distributed architecture, because duplicate data still travels across the WAN or LAN before it is removed.
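The trade-off can be illustrated with a small Python sketch of the target-side case. As before, this is only a sketch under assumptions: the 4 KB segment size, the SHA-256 fingerprint, and the function name `target_side_backup` are hypothetical, and a plain dictionary stands in for the target device.

```python
import hashlib

SEGMENT_SIZE = 4096  # hypothetical segment size, chosen only for illustration


def target_side_backup(data: bytes, store: dict) -> tuple:
    """Send the whole backup stream, then de-duplicate on the target device.

    `store` maps segment hash -> segment and plays the role of the target.
    Returns (bytes_sent_over_network, new_bytes_stored).
    """
    bytes_sent = len(data)  # every byte crosses the network, duplicate or not
    new_bytes_stored = 0
    for i in range(0, len(data), SEGMENT_SIZE):
        segment = data[i:i + SEGMENT_SIZE]
        digest = hashlib.sha256(segment).hexdigest()
        if digest not in store:  # redundancy is removed only after arrival
            store[digest] = segment
            new_bytes_stored += len(segment)
    return bytes_sent, new_bytes_stored
```

For a stream of eleven 4 KB segments, ten of them identical, this returns 45,056 bytes sent but only 8,192 bytes newly stored. The storage saving is real, yet the network carried the full stream, which is the limitation noted above.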
About The Author: Jedidiah Yueh is a vice president of product management for EMC Corporation. Prior to EMC’s acquisition of Avamar in November 2006, Yueh co-founded Avamar Technologies in 1999. Yueh graduated Phi Beta Kappa, magna cum laude with an AB from Harvard University. He received the John Harvard Scholarship and a nomination by the president and the fellows of Harvard University for the Rhodes Scholarship. In 1992, Yueh was designated a US Presidential Scholar under former President George H. W. Bush, one of the highest academic honors awarded nationally.
"Appeared in DRJ's Spring 2008 Issue"