[CentOS] CentOS and LessFS

Wed Jan 18 03:31:23 UTC 2012
Les Mikesell <lesmikesell at gmail.com>

On Tue, Jan 17, 2012 at 6:41 PM, Nataraj <incoming-centos at rjl.com> wrote:
>
>> I wouldn't trust any of the software block-dedup systems with my only
>> copy of something important - plus they need a lot of RAM which your
>> old systems probably don't have either.
>>
>
> I am interested in backuppc; however, from what I read online it appears
> that zfs is a very featureful, robust, high-performance filesystem that
> is heavily used in production environments.  It has features that allow
> you to specify that if the reference count for a block goes above
> certain levels it should keep two or three copies of that block, and
> those copies could be on separate storage devices within the pool.  It
> also supports compression.

It's probably fine on Solaris, where it has had years of development
and testing.  But I don't expect the Linux ports to be very mature yet
- hence the lack of trust.

> With backuppc deduplication, you're still hosed if your only
> copy of the file goes bad.  Why should block-level deduplication be any
> worse than file-level deduplication?

Nothing will fix a file if the disk underneath goes bad and you aren't
running RAID.  And in my case I run RAID1 and regularly swap disks out
for offsite copies and resync.  But backuppc makes the links based on
an actual comparison, so if an old copy is somehow corrupted, the next
full will be stored separately, not linked.
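
To illustrate what that pooling does, here is a rough sketch of the idea
(not BackupPC's actual code, which also deals with hash collisions and
compression; the pool path is made up):

import filecmp
import hashlib
import os
import shutil

POOL_DIR = "/var/lib/backuppc/pool"   # hypothetical pool location

def file_digest(path):
    """Hash the whole file just to pick a candidate pool entry."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def store(src, dest):
    """Hard-link dest to the pool only after a full byte-for-byte
    comparison; a corrupted or hash-colliding pool file is never reused."""
    pooled = os.path.join(POOL_DIR, file_digest(src))
    if os.path.exists(pooled) and filecmp.cmp(pooled, src, shallow=False):
        os.link(pooled, dest)      # identical content: link, no new copy
    else:
        shutil.copy2(src, dest)    # new (or corrupted pool) content: store it
        if not os.path.exists(pooled):
            os.link(dest, pooled)  # seed the pool with this copy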

> Furthermore, zfs has very high redundancy and recovery ability for the
> internal filesystem data structures.  Here's a video describing ZFS's
> deduplication implementation:  http://blogs.oracle.com/video/entry/zfs_dedup

I agree that the design sounds good, and I'd probably be using it if I
used Solaris - or maybe even the FreeBSD port.

> At this point I am only reading the experience of others, but I am
> inclined to try it.  I back up a mediawiki/mysql database, and the new
> records are added to the database largely by appending.  Even with
> compression, it's a pain to back up the whole thing every day.
> Block-level dedup seems like it would be a good solution for that.

You are still going to have to go through the motions of copying the
whole thing and letting the receiving filesystem do hash comparisons
on each block to accomplish the dedup.
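
Something like this, conceptually, has to run over every block of the
dump on the receiving side (a standalone sketch, not how lessfs or zfs
actually store anything; the block size and path are arbitrary):

import hashlib

BLOCK_SIZE = 128 * 1024   # arbitrary; dedup filesystems pick their own

def dedup_stats(path):
    """Read a file block by block, hashing every block - roughly the
    work the receiving filesystem does on the whole dump even when
    almost nothing new ends up being stored."""
    seen = set()
    total = dupes = 0
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(BLOCK_SIZE), b""):
            total += 1
            digest = hashlib.sha256(block).digest()
            if digest in seen:
                dupes += 1
            else:
                seen.add(digest)
    return total, dupes

print(dedup_stats("/var/backups/mysqldump.sql"))   # example path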

> Les, do you run backuppc on ext3 or ext4 filesystems?  I remember
> someone saying a while back that a filesystem with more inodes was
> required for a substantial backuppc deployment.

That really depends on the size of the files you back up and how much
churn there is in the history you keep.  I wouldn't expect it to be a
problem unless you have a lot of users with big maildir-type
directories.  Eons ago, when I used it with smaller drives and the
alternative was ext2, I used reiserfs, but more recently I just use
ext3 (and ext4 in the newest setup) with the defaults.  Some people on
the backuppc mailing list prefer xfs, though.
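
If you want to check whether inodes are actually getting tight, it's
the same figure df -i reports for the pool filesystem.  A trivial check
(the pool path is just an example):

import os

def inode_report(mountpoint):
    """Show inode usage for the filesystem holding the pool - the same
    numbers df -i prints."""
    st = os.statvfs(mountpoint)
    used = st.f_files - st.f_ffree
    pct = 100.0 * used / st.f_files if st.f_files else 0.0
    print("%s: %d of %d inodes used (%.1f%%)" %
          (mountpoint, used, st.f_files, pct))

inode_report("/var/lib/backuppc")   # hypothetical pool path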

-- 
   Les Mikesell
      lesmikesell at gmail.com


