[CentOS] Filesystem that doesn't store duplicate data

Thu Dec 6 13:53:13 UTC 2007
Les Mikesell <lesmikesell at gmail.com>

Peter Arremann wrote:

>> How about a FUSE file system (userland, e.g. NTFS-3G) that layers
>> on top of any file system that supports hard links
> 
> That would be easy but I can see a few issues with that approach: 
> 
> 1) Working at file level rather than block level is going to be much less 
> efficient. I for one have gigabytes of revisions of files that have changed 
> only a little between each revision. 

That is a problem for the way backuppc stores things - but at least it 
can compress the files.

> 2) You have to write all datablocks to disk and then erase them again if you 
> find a match. That will slow you down and create some weird behavior. I.e. 
> you know the FS shouldn't store duplicate data, yet you can't use cp to copy 
> a 10G file if only 9G are free. If you copy an 8G file, you see the usage 
> increase till only 1G is free, then when your app closes the file, you are 
> going to go back to 9G free... 

Only using it for backup storage is a special case where this is not so 
bad. Backuppc also has a way to rsync against the stored copy so 
matching files (or parts) may not need to be transferred at all.

> 3) Rather than continuously looking for matches on block level, you have to 
> search for matches on files that can be any size. That is fine if you have a 
> 100K file - but if you have a 100M or larger file, the checksum calculations 
> will take you forever. 

The backuppc scheme is to hash some amount of the uncompressed file and 
use that hash as the pooled filename for the link. That quickly weeds 
out most possibilities, and because the hash is taken over uncompressed 
data, the compression level can be changed later.  The full check then 
only has to be done on collisions.
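
As a rough illustration of that idea (not backuppc's actual code), here 
is a minimal Python sketch. The names pool_key and store, the 1 MiB 
prefix, the use of MD5 and the "_N" collision suffix are all assumptions 
made for the example. It hashes only the start of the file to pick a 
pool name, and does a full byte comparison only when that name already 
exists in the pool.

```python
import filecmp
import hashlib
import os

PARTIAL_BYTES = 1 << 20  # assumption: hash only the first 1 MiB


def pool_key(path, limit=PARTIAL_BYTES):
    """Cheap key: digest of at most `limit` leading bytes of the file."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read(limit)).hexdigest()


def store(path, pool_dir, limit=PARTIAL_BYTES):
    """Link `path` into the pool. Return True if it matched an existing
    pool file and was replaced by a hard link to it, False if it became
    a new pool entry."""
    os.makedirs(pool_dir, exist_ok=True)
    key = pool_key(path, limit)
    n = 0
    while True:
        # Hypothetical naming: files sharing a key get suffixes _0, _1, ...
        candidate = os.path.join(pool_dir, f"{key}_{n}")
        if not os.path.exists(candidate):
            os.link(path, candidate)  # first file with this content
            return False
        if os.path.samefile(path, candidate):
            return True  # already linked to this pool entry
        # Same key: only now pay for the full content comparison.
        if filecmp.cmp(path, candidate, shallow=False):
            tmp = path + ".dedup-tmp"
            os.link(candidate, tmp)
            os.replace(tmp, path)  # swap in the link atomically
            return True
        n += 1  # key collision with different content, try next slot
```

The cost of the partial hash is bounded by the prefix size rather than 
the file size, so a 100M file only gets read in full when something in 
the pool already shares its key.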

> This means rather than adding a specific, small 
> penalty to every write call, you add an unknown penalty, proportional to file 
> size when closing the file. Also, the fact that most C coders don't check the 
> return code of close doesn't make me happy there... 

In backuppc, the writer understands the scheme - and the linking is 
somewhat decoupled from the transfers.  But even in a normal filesystem, 
writes are buffered, and if you don't fsync there is a lot that can go 
wrong after a close() reports success.
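
For what it's worth, a writer that cares can ask for the data to be 
flushed before it closes. A minimal Python sketch, assuming a POSIX-style 
system (durable_write is a made-up name for the example):

```python
import os


def durable_write(path, data):
    """Write `data` (bytes) to `path` and fsync before closing, so that
    errors surface as exceptions instead of being lost in the cache."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # ask the kernel to push buffered data to the device
    finally:
        os.close(fd)  # a failure here raises OSError too
```

Even this only covers the file's contents; making a newly created name 
survive a crash generally also needs an fsync on the containing 
directory.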

-- 
   Les Mikesell
    lesmikesell at gmail.com