[CentOS] Filesystem that doesn't store duplicate data

Ruslan Sivak rsivak at istandfor.com
Thu Dec 6 04:57:42 UTC 2007


Luke Dudney wrote:
> NetApp's WAFL with A-SIS (advanced single instance storage) does this. 
> From a quick google:
>
> http://searchstorage.techtarget.com/originalContent/0,289142,sid5_gci1255018,00.html 
> says:
> ...  calculates a 16-bit checksum for each block of data it stores. 
> For data deduplication, the hashes are pulled into a database and 
> "redundancy candidates" that look similar are identified. Those blocks 
> are then compared bit by bit, and if they are identical, the new block 
> is discarded.
>
> The pre-sales engineer I spoke to regarding this said that it's not 
> done on demand but rather by a periodic background process. It's 
> pitched for backup and archiving functions. If you have NetApp kit it 
> can apply this to any of your data on the Filer, be it via CIFS, NFS, 
> FC or iSCSI.
>
> While this isn't available on Linux it proves that there is market 
> demand for it, that it can be done and probably also appears to some 
> kernel hackers as a challenge...
>
> cheers
> Luke
>
Yeah, I originally got the idea from the NetApp marketing materials.
It would be cool if this were available for free on Linux.
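
For what it's worth, the scheme in the quoted article (a weak per-block
checksum, then a full comparison of the candidates) can be sketched in a
few lines of Python. This is only an illustration, not how WAFL or A-SIS
is implemented: the 4096-byte block size, the use of the low 16 bits of
Adler-32 as the checksum, and the names checksum16 and dedupe are all my
own assumptions. It also dedupes as blocks arrive, whereas the NetApp
feature reportedly runs as a periodic background process.

```python
import zlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, not taken from the article


def checksum16(block: bytes) -> int:
    """Weak 16-bit checksum (low 16 bits of Adler-32); a stand-in only."""
    return zlib.adler32(block) & 0xFFFF


def dedupe(blocks):
    """Return (store, refs): the unique blocks kept, and for each input
    block the index of its copy in store."""
    store = []                  # unique blocks actually kept
    by_sum = defaultdict(list)  # checksum -> indices into store (candidates)
    refs = []
    for block in blocks:
        key = checksum16(block)
        for idx in by_sum[key]:
            # A matching checksum only marks a candidate; compare the bytes.
            if store[idx] == block:
                refs.append(idx)  # identical: the new block is discarded
                break
        else:
            # No identical block found: keep this one and index it.
            store.append(block)
            by_sum[key].append(len(store) - 1)
            refs.append(len(store) - 1)
    return store, refs


data = [b"a" * BLOCK_SIZE, b"b" * BLOCK_SIZE, b"a" * BLOCK_SIZE]
store, refs = dedupe(data)
print(len(store), refs)  # 2 [0, 1, 0]
```

The full comparison matters because a 16-bit checksum has only 65536
possible values, so different blocks will often share one; the checksum
just narrows down which stored blocks are worth comparing.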

Russ


