[CentOS] Deduplicated archives via hardlinks [Was: XFS or EXT3 ?]
Bowie Bailey
Bowie_Bailey at BUC.com
Fri Dec 3 21:36:51 UTC 2010
On 12/3/2010 4:14 PM, Adam Tauno Williams wrote:
> On Fri, 2010-12-03 at 12:51 -0800, John R Pierce wrote:
>> On 12/03/10 12:25 PM, Les Mikesell wrote:
>>> Whenever anyone mentions backups, I like to plug the backuppc program
>>> (http://backuppc.sourceforge.net/index.html and packaged in EPEL). It
>>> uses compression and hardlinks all duplicate files to keep much more
>>> history online than you'd expect, with a nice web interface - and does
>>> pretty much everything automatically.
>> I'm curious how you back up backuppc, like for disaster recovery,
> I know nothing about backuppc; I don't use it. But we use rsync with
> the same concept for a deduplicated archive.
>
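The hardlink-snapshot concept can be sketched in a few lines of shell -
this is a toy demo with made-up paths, not anyone's actual setup. (rsync's
--link-dest option does the clone-and-update in one step.)

```shell
#!/bin/sh
# Toy demo of hardlink snapshots (hypothetical paths, GNU coreutils).
# Clone yesterday's snapshot as hard links - no file data is copied -
# then a normal "rsync -a --delete /data/ $d/daily.0/" would replace
# only the files that changed since yesterday.
set -e
d=$(mktemp -d)
mkdir "$d/daily.1"
echo "big file contents" > "$d/daily.1/file.bin"
cp -al "$d/daily.1" "$d/daily.0"     # hard-link clone of the snapshot
# Both names point at the same inode, so the data exists once on disk:
stat -c '%i' "$d/daily.1/file.bin" "$d/daily.0/file.bin"
```

Unchanged files cost no extra space no matter how many snapshots
reference them; a changed file simply gets a fresh copy in the newest
snapshot while the old snapshots keep their links.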
>> archival, etc? since all the files are in a giant mess of symlinks
> No, they are not symbolic links - they are *hard links*. That they are
> hard links is the actual magic: symbolic links would not provide the
> automatic deallocation of expired files.
>
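A quick throwaway demo of why the hard-link part matters: the data
blocks are freed only when the *last* name is removed, so expiring one
snapshot's link never breaks the others, whereas a symlink would just
dangle.

```shell
#!/bin/sh
# Hard links vs. symlinks for an expiring archive (throwaway temp files).
set -e
d=$(mktemp -d)
echo "pool data" > "$d/pool_copy"
ln "$d/pool_copy" "$d/backup_copy"   # second name for the same inode
rm "$d/pool_copy"                    # "expire" one reference...
cat "$d/backup_copy"                 # ...the data is still there
# A symlink to the removed name would dangle instead:
ln -s "$d/pool_copy" "$d/dangling"
cat "$d/dangling" 2>/dev/null || echo "dangling symlink"
```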
>> (for deduplication) with versioning, I'd have to assume the archive
>> volume gets really messy after a while, and further, something like
>> that is pretty darn hard to make a replica of.
> I don't see why; only the archive is deduplicated in this manner, and
> it certainly isn't "messy". One simply makes a backup [for us that
> means to tape - a disk is not a backup] of the most current snapshot.
Actually, making a backup of BackupPC's data pool (or just moving it to
new disks) does get messy. With a large pool there are so many
hardlinks that rsync has trouble dealing with them - it eats all your
memory and takes forever. This is a frequent topic of conversation on the
BackupPC list. However, the next major version of BackupPC is supposed
to use a different method of deduplication that will not use hardlinks
and will be much easier to back up.
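To make the memory problem concrete: every pool file carries one extra
link per backup that references it, and a copier like "rsync -aH" has to
remember every multiply-linked inode until it has seen all of its names.
A mock pool (made-up layout, not BackupPC's real on-disk format):

```shell
#!/bin/sh
# Mock deduplicated pool: one pool file, five backups hard-linking it.
# "rsync -aH" must hold state for every such inode while copying, which
# is why memory use blows up on a pool with millions of files.
set -e
d=$(mktemp -d)
mkdir "$d/pool" "$d/pc"
echo "deduplicated contents" > "$d/pool/abc123"
for i in 1 2 3 4 5; do
    ln "$d/pool/abc123" "$d/pc/backup$i"
done
stat -c '%h' "$d/pool/abc123"   # link count: six names, one inode
```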
--
Bowie