on 12/5/2007 8:57 PM Ruslan Sivak spake the following:
> Luke Dudney wrote:
>> NetApp's WAFL with A-SIS (advanced single instance storage) does this.
>> From a quick google:
>>
>> http://searchstorage.techtarget.com/originalContent/0,289142,sid5_gci1255018,00.html
>> says:
>> ... calculates a 16-bit checksum for each block of data it stores.
>> For data deduplication, the hashes are pulled into a database and
>> "redundancy candidates" that look similar are identified. Those blocks
>> are then compared bit by bit, and if they are identical, the new block
>> is discarded.
>>
>> The pre-sales engineer I spoke to regarding this said that it's not
>> done on demand but rather by a periodic background process. It's
>> pitched for backup and archiving functions. If you have NetApp kit it
>> can apply this to any of your data on the Filer, be it via CIFS, NFS,
>> FC or iSCSI.
>>
>> While this isn't available on Linux it proves that there is market
>> demand for it, that it can be done and probably also appears to some
>> kernel hackers as a challenge...
>>
>> cheers
>> Luke
>>
> Yea, I originally got the idea from the NetApp marketing materials.
> Would be cool if this was available for free for Linux.
> Russ

But the NetApp appliance has a processor that is only doing storage
work. It isn't running any other tasks, so it has lots of free time to
handle the deduplication. And I wouldn't be surprised if there were a
few ASICs or PLCs doing much of the checksumming and block compares.
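
For anyone curious what the scheme quoted above boils down to, here is a
minimal Python sketch of the two-stage idea: a cheap 16-bit checksum to
find "redundancy candidates", then a full compare before a block is
discarded. This is not NetApp's code. The 4 KB block size, the truncated
CRC32 used as the checksum, and the names checksum16 and dedup_pass are
all assumptions made for illustration; the article does not say which
checksum A-SIS uses.

```python
import zlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def checksum16(block: bytes) -> int:
    """Weak 16-bit checksum: CRC32 truncated to its low 16 bits (an assumption)."""
    return zlib.crc32(block) & 0xFFFF


def dedup_pass(blocks, checksum=checksum16):
    """Run one deduplication pass over a list of blocks.

    Returns (unique, block_map), where unique holds each distinct block once
    and block_map[i] is the index in unique that stores the data of blocks[i].
    """
    candidates = defaultdict(list)  # checksum -> indices into unique
    unique = []
    block_map = []
    for block in blocks:
        key = checksum(block)
        for idx in candidates[key]:
            # A matching checksum only marks a candidate; with 16 bits,
            # collisions are common, so compare the full contents.
            if unique[idx] == block:
                block_map.append(idx)  # identical: discard the new block
                break
        else:
            # No identical block stored yet: keep this one.
            candidates[key].append(len(unique))
            block_map.append(len(unique))
            unique.append(block)
    return unique, block_map


if __name__ == "__main__":
    data = [b"a" * BLOCK_SIZE, b"b" * BLOCK_SIZE, b"a" * BLOCK_SIZE]
    unique, block_map = dedup_pass(data)
    print(len(unique), block_map)
```

With the three sample blocks, two are stored and the third maps back to
the first. The full compare is what makes such a short checksum safe to
use: the checksum only narrows down what to compare, which is also the
part that would be cheap to push into dedicated hardware.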