On 12/3/2010 3:14 PM, Adam Tauno Williams wrote:
>
> I know nothing about backuppc; I don't use it.  But we use rsync with
> the same concept for a deduplicated archive.

Backuppc is a couple of perl scripts, one of which happens to
re-implement rsync in a way that lets it use stock rsync on the remote
while transparently accessing a compressed copy on the server side.  It
can also use tar or samba to copy files in, then does the same
compression/dedup operation.

>> (for deduplication) with versioning, I'd have to assume the archive
>> volume gets really messy after awhile, and further, something like that
>> is pretty darn hard to make a replica of it.
>
> I don't see why; only the archive is deduplicated in this manner, and
> it certainly isn't "messy".  One simply makes a backup [for us that
> means to tape - a disk is not a backup] of the most current snapshot.

It does get messy, because backuppc archives typically have millions of
hardlinked files.  It doesn't just hardlink between subsequent runs of
the same machine; it hardlinks all files with identical content, from
the same machine or others, using a pool directory of hashed filenames
as a common link to match them up quickly.

> The script just looks like -
>
> export ROOT="/srv/cifs/Arabis-Red"
> export STAMP=`date +%Y%m%d%H`
> export LASTSTAMP=`cat $ROOT/LAST.STAMP`
> mkdir $ROOT/$STAMP
> mkdir $ROOT/$STAMP/home
>
> nice rsync --verbose --archive --delete --acls \
>    --link-dest $ROOT/$LASTSTAMP/home/ \
>    --numeric-ids \
>    -e ssh \
>    archivist@arabis-red:/home/ \
>    $ROOT/$STAMP/home/ \
>    2>&1 > $ROOT/$STAMP/home.log
>
> echo $STAMP > $ROOT/LAST.STAMP

But that won't match up multiple copies of the same file in different
locations, or help with many machines holding mostly-duplicate content.
The backuppc scheme works pretty well in normal usage, but most
file-oriented approaches to copying the whole backuppc archive have
scaling problems, because they have to track all the inodes and names
to match up the hard links.
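Just to illustrate the pool idea, a rough sketch in shell (a
simplification with made-up paths, not backuppc's actual code - the
real pool is managed by the perl scripts and the pool files are kept
compressed):

export POOL=/srv/backup/pool          # hypothetical pool directory
export SNAP=/srv/backup/2010120315    # hypothetical snapshot to dedup

find $SNAP -type f | while read -r f; do
    # name the pool entry after a hash of the file's content
    h=`sha1sum "$f" | cut -d' ' -f1`
    if [ -e $POOL/$h ]; then
        # content already in the pool - replace the file with a
        # hardlink to the pool copy
        ln -f $POOL/$h "$f"
    else
        # first time this content has been seen - it becomes the
        # pool copy
        ln "$f" $POOL/$h
    fi
done

Every later snapshot containing the same bytes becomes one more link to
the same inode, which is why the space usage stays low - and also why
anything that copies the tree file-by-file (rsync -H included) has to
keep a map of all those inodes and names to rebuild the links, and bogs
down at that scale.

-- 
   Les Mikesell
    lesmikesell at gmail.com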