Luke Dudney wrote:
> NetApp's WAFL with A-SIS (advanced single instance storage) does this.
> From a quick google:
>
> http://searchstorage.techtarget.com/originalContent/0,289142,sid5_gci1255018,00.html
> says:
> ... calculates a 16-bit checksum for each block of data it stores.
> For data deduplication, the hashes are pulled into a database and
> "redundancy candidates" that look similar are identified. Those blocks
> are then compared bit by bit, and if they are identical, the new block
> is discarded.
>
> The pre-sales engineer I spoke to regarding this said that it's not
> done on demand but rather by a periodic background process. It's
> pitched for backup and archiving functions. If you have NetApp kit it
> can apply this to any of your data on the Filer, be it via CIFS, NFS,
> FC or iSCSI.
>
> While this isn't available on Linux it proves that there is market
> demand for it, that it can be done and probably also appears to some
> kernel hackers as a challenge...
>
> cheers
> Luke
>

Yeah, I originally got the idea from the NetApp marketing materials.
Would be cool if this were available for free on Linux.

Russ
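
P.S. For anyone curious, here is a rough Python sketch of the scheme the quoted article describes: a cheap 16-bit checksum per block, grouping blocks with matching checksums as "redundancy candidates", then a full comparison before a block is discarded. This is only my reading of that description, not NetApp's actual code. The 4 KiB block size, the use of the low 16 bits of CRC-32 as the checksum, and the names checksum16, split_blocks and dedupe_pass are all assumptions made for illustration.

```python
from __future__ import annotations

import zlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size; the quoted article does not state one


def checksum16(block: bytes) -> int:
    """Weak 16-bit fingerprint: low 16 bits of CRC-32 (a stand-in checksum)."""
    return zlib.crc32(block) & 0xFFFF


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe_pass(blocks: list[bytes]) -> tuple[list[bytes], list[int]]:
    """One offline pass over already-stored blocks.

    Returns (unique_blocks, refs), where refs[i] is the index in
    unique_blocks that holds the content of the i-th input block.
    """
    candidates = defaultdict(list)  # checksum -> indices into unique_blocks
    unique: list[bytes] = []
    refs: list[int] = []
    for block in blocks:
        key = checksum16(block)
        for idx in candidates[key]:
            # A 16-bit checksum collides often, so a match is only a
            # candidate; confirm with a full comparison before discarding.
            if unique[idx] == block:
                refs.append(idx)
                break
        else:
            # No identical block seen so far: keep this one.
            unique.append(block)
            candidates[key].append(len(unique) - 1)
            refs.append(len(unique) - 1)
    return unique, refs


if __name__ == "__main__":
    data = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
    unique, refs = dedupe_pass(split_blocks(data))
    print(len(unique), refs)
```

Because the checksum is only used to narrow down candidates and every match is verified in full, a collision costs an extra comparison but never corrupts data. Running this as a periodic batch job rather than on every write matches what the pre-sales engineer described.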