[CentOS] ZFS on Linux testing effort

Fri Dec 6 17:08:08 UTC 2013
Chuck Munro <chuckm at seafoam.net>

On 04.12.2013 14:05, nux at li.nux.ro wrote:
>>> On 04.12.2013 14:05, John Doe wrote:
>>>>> From: Lists <lists at benjamindsmith.com>
>>>>>
>>>>>>> Our next big test is to try out ZFS filesystem send/receive in lieu
>>>>>>> of our current backup processes based on rsync. Rsync is a fabulous
>>>>>>> tool, but is beginning to show performance/scalability issues
>>>>>>> dealing with the many millions of files being backed up, and we're
>>>>>>> hoping that ZFS filesystem replication solves this.
>>>>>
>>>>> Not sure if I already mentioned it but maybe have a look at:
>>>>> http://code.google.com/p/lsyncd/
>>> I'm not so sure inotify works well with millions of files, not to
>>> mention it uses rsync. :D
>>>
>>> -- Sent from the Delta quadrant using Borg technology! Nux!
>>
>> I can attest to the usefulness of 'lsyncd' for large numbers of files
>> (our file server has almost 2 million in active use, with a second
>> backup server that's lsync'd to the first).
>>
>> Things to note:
>> - Yes, lsyncd does use rsync, but it issues an 'exclude *' followed by
>>   the list of only the file(s) that need updating at that moment.
>>
>> - The inotify service can be jacked waaaay up (three kernel parameters)
>>   to handle millions of files if you wish.  Just make sure you have
>>   lots of RAM.
> Be careful with it. Sadly I found out that inotify would consistently
> fail on InnoDB files (ibd); I had to use stupid while loops and check
> mtimes to perform some stuff that inotify-cron would've done much more
> elegantly ...
>
> -- Sent from the Delta quadrant using Borg technology! Nux!

Interesting point, something I didn't know.  Fortunately in my case 
there are no db files involved directly, just db dumps wrapped in a 
tarball along with other associated stuff, sent from other servers.
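
For anyone who hits the same inotify failure, the mtime-polling fallback described above could be sketched roughly like this in Python.  This is only an illustration: the function names, the 30-second interval and the overall structure are assumptions, not what was actually run on the InnoDB files.

```python
import os
import time


def snapshot_mtimes(root):
    """Map every regular file under root to its mtime in nanoseconds."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtimes[path] = os.stat(path).st_mtime_ns
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return mtimes


def changed_paths(old, new):
    """Return paths that are new or whose mtime differs, sorted.

    Deleted files are not reported; this only tracks additions and updates.
    """
    return sorted(p for p, m in new.items() if old.get(p) != m)


def poll(root, interval=30.0, rounds=None):
    """Yield a list of changed paths after each polling interval.

    rounds=None polls forever; an integer stops after that many passes.
    """
    previous = snapshot_mtimes(root)
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        current = snapshot_mtimes(root)
        changed = changed_paths(previous, current)
        if changed:
            yield changed
        previous = current
        done += 1
```

Each pass walks the whole tree, so the cost grows with the number of files rather than the number of changes, which is exactly what inotify was supposed to avoid.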

I would expect lsync'ing db files to be a nasty non-stop process if the
database is constantly being updated, so it would be best to use the
db's own replication tools and configure inotify/lsyncd to ignore the db
directories.  I believe that by default lsyncd instructs rsync to do
whole-file transfers, so a large db could be a real problem.
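
To make the idea concrete, here is a rough Python sketch of building an rsync call that transfers only a list of changed files while skipping anything under a db directory.  It is not lsyncd's own code; the helper names and the example paths are made up, and --whole-file is included only to mirror the behaviour I believe is the default.  rsync applies the first matching filter rule, so the includes (with their parent directories) are listed before the final exclude.

```python
def normalize(rel_path):
    """Strip leading/trailing slashes so paths compare consistently."""
    return rel_path.strip("/")


def is_ignored(rel_path, ignored_dirs):
    """True if rel_path is one of the ignored directories or lies inside one."""
    rel = normalize(rel_path)
    for d in ignored_dirs:
        d = normalize(d)
        if rel == d or rel.startswith(d + "/"):
            return True
    return False


def include_rules(rel_paths):
    """Build rsync include patterns for the files and their parent dirs.

    rsync only descends into a directory that is itself included, so each
    parent is listed (with a trailing slash) before the file it contains.
    Note: wildcard characters in file names are not escaped in this sketch.
    """
    rules = []
    for rel in rel_paths:
        parts = normalize(rel).split("/")
        for i in range(1, len(parts)):
            parent = "/" + "/".join(parts[:i]) + "/"
            if parent not in rules:
                rules.append(parent)
        leaf = "/" + "/".join(parts)
        if leaf not in rules:
            rules.append(leaf)
    return rules


def rsync_command(source, target, changed, ignored_dirs=()):
    """Return an argv list for rsync, or None if nothing needs syncing."""
    wanted = [p for p in changed if not is_ignored(p, ignored_dirs)]
    if not wanted:
        return None
    cmd = ["rsync", "-rlt", "--whole-file"]
    cmd += ["--include=" + rule for rule in include_rules(wanted)]
    cmd.append("--exclude=*")  # everything not listed above is skipped
    cmd += [source, target]
    return cmd
```

The result is meant to be passed as a list to something like subprocess.run, so the '*' is never expanded by a shell.  With a changed list of "dumps/site1/db.tar.gz" and "mysql/ibdata1" and ignored_dirs=["mysql"], only the tarball and its two parent directories end up in the include rules.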

Thanks for the important heads-up!
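
For anyone wanting to check the inotify limits mentioned earlier in the thread, here is a small Python sketch.  It assumes the three kernel parameters are the ones exposed under /proc/sys/fs/inotify (max_user_watches, max_user_instances, max_queued_events) and that roughly one watch is needed per directory; both are my assumptions, so treat the comparison as a rough guide only.

```python
import os

INOTIFY_DIR = "/proc/sys/fs/inotify"
PARAMS = ("max_user_watches", "max_user_instances", "max_queued_events")


def read_inotify_limits(base=INOTIFY_DIR):
    """Read the current inotify limits; a value is None if unreadable."""
    limits = {}
    for name in PARAMS:
        try:
            with open(os.path.join(base, name)) as fh:
                limits[name] = int(fh.read().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


def count_dirs(root):
    """Count directories under root, including root itself."""
    return sum(1 for _ in os.walk(root))


def watches_sufficient(root, limits):
    """Compare the directory count with max_user_watches.

    Assumes one watch per directory; returns None if the limit is unknown.
    """
    limit = limits.get("max_user_watches")
    if limit is None:
        return None
    return count_dirs(root) <= limit
```

Raising the values is done with sysctl (the fs.inotify.* keys) as root; the sketch only reads them.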