[CentOS] Deduplicated archives via hardlinks [Was: XFS or EXT3 ?]

Fri Dec 3 22:32:37 UTC 2010
Gavin Carr <gavin at openfusion.com.au>

On Fri, Dec 03, 2010 at 04:07:06PM -0600, Les Mikesell wrote:
>The backuppc scheme works pretty well in normal usage, but most
>file-oriented approaches to copy the whole backuppc archive have scaling
>problems because they have to track all the inodes and names to match up
>the hard links.

That's been my experience with other hard-link-based backup schemes as
well. For 'normal' sized backups they work pretty well, but for some
value of 'large' the number of inodes and the tree traversal time start
to cause real performance problems.
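
To make the inode-tracking cost concrete, here is a minimal Python
sketch of the bookkeeping a file-oriented copy has to do to preserve
hard links. It is an illustration only, not how backuppc or any
particular copy tool is implemented; the function name build_link_map
is made up for the example.

```python
import os
import stat


def build_link_map(root):
    """Group paths under root by (device, inode) for multiply-linked files.

    Hypothetical helper for illustration: a copier needs a table like
    this so that the second and later names of an inode can be recreated
    as hard links instead of being copied again.
    """
    link_map = {}  # (st_dev, st_ino) -> list of paths sharing that inode
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only regular files with more than one link need tracking.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                link_map.setdefault((st.st_dev, st.st_ino), []).append(path)
    return link_map
```

The table gets one entry per multiply-linked inode and one stored path
per name, and the whole tree has to be walked and stat'ed to build it,
which is where the memory and traversal time go on a large archive.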

I'd be interested to know how large people's backups are where they're
still seeing decent performance using approaches like this. I believe we
started seeing problems once we hit a few TB (on ext3).

We've moved to brackup (http://code.google.com/p/brackup/) for these
reasons, and are doing nightly backups of 18TB of data quite happily.
Brackup does fancy chunk-based deduplication (somewhat like git), and so
avoids the hard link approach entirely.
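
For anyone unfamiliar with the general idea, a rough Python sketch of
content-addressed chunk storage follows. This is not brackup's actual
code or on-disk format; the fixed chunk size, the choice of SHA-256,
the in-memory dict used as the store, and the function names are all
assumptions made for the example.

```python
import hashlib

DEFAULT_CHUNK_SIZE = 1024 * 1024  # assumed value, purely illustrative


def store_chunks(data, store, chunk_size=DEFAULT_CHUNK_SIZE):
    """Split data into fixed-size chunks and store each under its digest.

    Returns the ordered list of digests needed to rebuild the data.
    A chunk whose digest is already in the store is not stored again.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep the first copy only
        recipe.append(digest)
    return recipe


def restore(recipe, store):
    """Rebuild the original bytes from an ordered list of chunk digests."""
    return b"".join(store[digest] for digest in recipe)
```

Duplicate content is found by digest lookup rather than by matching
inodes, so the archive needs no hard links at all: each file is just an
ordered list of chunk digests.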