[CentOS] Filesystem that doesn't store duplicate data
Ruslan Sivak
rsivak at istandfor.com
Thu Dec 6 04:57:42 UTC 2007
Luke Dudney wrote:
> NetApp's WAFL with A-SIS (advanced single instance storage) does this.
> From a quick google:
>
> http://searchstorage.techtarget.com/originalContent/0,289142,sid5_gci1255018,00.html
> says:
> ... calculates a 16-bit checksum for each block of data it stores.
> For data deduplication, the hashes are pulled into a database and
> "redundancy candidates" that look similar are identified. Those blocks
> are then compared bit by bit, and if they are identical, the new block
> is discarded.
>
> The pre-sales engineer I spoke to regarding this said that it's not
> done on demand but rather by a periodic background process. It's
> pitched for backup and archiving functions. If you have NetApp kit it
> can apply this to any of your data on the Filer, be it via CIFS, NFS,
> FC or iSCSI.
>
> While this isn't available on Linux, it shows that there is market
> demand for it and that it can be done, and it probably also looks like
> a challenge to some kernel hackers...
>
> cheers
> Luke
>
Yeah, I originally got the idea from the NetApp marketing materials.
It would be cool if this were available for free on Linux.
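
For illustration, here is a rough sketch of the scheme the quoted article describes: checksum each block, treat blocks with matching checksums as candidates, compare candidates byte for byte, and keep only one copy of identical blocks. This is not NetApp's implementation. The 4096-byte block size, the use of the low 16 bits of CRC-32 as the checksum, and the names dedupe_blocks and rebuild are all assumptions made for the example.

```python
import zlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size; the quoted article does not state one


def weak_checksum(block: bytes) -> int:
    # Stand-in 16-bit checksum (low 16 bits of CRC-32).
    # The actual checksum algorithm is not given in the article.
    return zlib.crc32(block) & 0xFFFF


def dedupe_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Return (unique_blocks, block_map); block_map[i] indexes unique_blocks."""
    unique_blocks = []               # one stored copy per distinct block
    candidates = defaultdict(list)   # checksum -> indices into unique_blocks
    block_map = []                   # logical block number -> stored block index

    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        key = weak_checksum(block)
        # A matching checksum only makes a block a candidate;
        # a full byte-for-byte comparison decides whether it is a duplicate.
        for idx in candidates[key]:
            if unique_blocks[idx] == block:
                block_map.append(idx)    # duplicate: reference the stored copy
                break
        else:
            unique_blocks.append(block)  # new content: store it
            new_idx = len(unique_blocks) - 1
            candidates[key].append(new_idx)
            block_map.append(new_idx)

    return unique_blocks, block_map


def rebuild(unique_blocks, block_map) -> bytes:
    # Reassemble the original data from the stored blocks and the map.
    return b"".join(unique_blocks[i] for i in block_map)
```

A real filesystem would run this as a background pass over on-disk blocks and update block pointers, as the pre-sales engineer described, rather than working on an in-memory buffer. The full comparison matters because a 16-bit checksum will collide often.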
Russ