Avoiding the disk bottleneck in the Data Domain deduplication file system
Remove 99% of the disk accesses for deduplication of real-world workloads
EMC | 26 April 2010, 16:52 | Storage
A significant challenge in deduplication is identifying and eliminating duplicate data segments on a low-cost system that cannot afford enough RAM to hold an index of all stored segments, and so may be forced to access an on-disk index for every input segment. This technical paper describes three techniques used in the production Data Domain deduplication file system to relieve that disk bottleneck.
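
To make the bottleneck concrete, the following is a minimal Python sketch, not the Data Domain implementation. It models the on-disk index as a dictionary with a lookup counter, and compares a naive store, which consults the index for every input segment, with one that first checks a small in-memory Bloom filter. The names `BloomFilter`, `DedupStore`, and `run`, the SHA-1 fingerprint, and the filter sizes are all assumptions chosen for illustration; this summary does not spell out the paper's three techniques, so the sketch only illustrates the general idea of answering "is this segment new?" from RAM.

```python
import hashlib


class BloomFilter:
    """In-memory bit array: answers 'definitely new' or 'possibly stored'."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        # num_bits must be a multiple of 8; num_hashes * 4 must fit in a
        # 20-byte SHA-1 fingerprint (illustrative sizes, not tuned values).
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, fingerprint: bytes):
        # Derive one bit position from each 4-byte slice of the fingerprint.
        for i in range(self.num_hashes):
            chunk = fingerprint[4 * i : 4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, fingerprint: bytes) -> None:
        for pos in self._positions(fingerprint):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, fingerprint: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(fingerprint)
        )


class DedupStore:
    """Toy segment store; a dict stands in for the on-disk index."""

    def __init__(self, use_filter: bool) -> None:
        self.on_disk_index = {}  # fingerprint -> segment (hypothetical layout)
        self.disk_lookups = 0    # counts simulated on-disk index accesses
        self.filter = BloomFilter() if use_filter else None

    def _disk_lookup(self, fingerprint: bytes) -> bool:
        self.disk_lookups += 1
        return fingerprint in self.on_disk_index

    def write(self, segment: bytes) -> bool:
        """Store a segment; return True if it was already stored."""
        fingerprint = hashlib.sha1(segment).digest()
        if self.filter is not None and not self.filter.might_contain(fingerprint):
            duplicate = False  # filter proves the segment is new: no index access
        else:
            duplicate = self._disk_lookup(fingerprint)
        if not duplicate:
            self.on_disk_index[fingerprint] = segment
            if self.filter is not None:
                self.filter.add(fingerprint)
        return duplicate


def run(use_filter: bool, unique_segments: int = 100):
    """Write each segment twice; return (duplicates found, index lookups)."""
    store = DedupStore(use_filter)
    duplicates = 0
    for _ in range(2):
        for i in range(unique_segments):
            duplicates += store.write(f"segment-{i}".encode())
    return duplicates, store.disk_lookups


if __name__ == "__main__":
    print("naive:   ", run(use_filter=False))
    print("filtered:", run(use_filter=True))
```

Both runs find the same duplicates, but the filtered store touches the simulated index less often. In this sketch the filter only spares lookups for segments that are new; lookups for segments that really are duplicates still reach the index and would need other measures, which the sketch does not attempt to model.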




