Thread: Misc. BSD/UNIX ZFS just got built-in deduplication
View Single Post
  #1   (View Single Post)  
Old 3rd November 2009
vermaden's Avatar
vermaden vermaden is offline
Administrator
 
Join Date: Apr 2008
Location: pl_PL.lodz
Posts: 1,056
Cool ZFS just got built-in deduplication

Quote:
Deduplication is the process of eliminating duplicate copies of data. Dedup is generally either file-level, block-level, or byte-level. Chunks of data -- files, blocks, or byte ranges -- are checksummed using some hash function that uniquely identifies data with very high probability.

(...)

Chunks of data are remembered in a table of some sort that maps the data's checksum to its storage location and reference count. When you store another copy of existing data, instead of allocating new space on disk, the dedup code just increments the reference count on the existing data. When data is highly replicated, which is typical of backup servers, virtual machine images, and source code repositories, deduplication can reduce space consumption not just by percentages, but by multiples.
MORE: http://blogs.sun.com/bonwick/en_US/entry/zfs_dedup
__________________
religions, worst damnation of mankind
"If 386BSD had been available when I started on Linux, Linux would probably never had happened." Linus Torvalds

Linux is not UNIX! Face it! It is not an insult. It is fact: GNU is a recursive acronym for “GNU's Not UNIX”.
vermaden's: links resources deviantart spreadbsd
Reply With Quote