[CentOS] ZFS on Linux testing effort

Fri Dec 13 21:52:13 UTC 2013
Lists <lists at benjamindsmith.com>

On 12/04/2013 06:05 AM, John Doe wrote:
> Not sure if I already mentioned it but maybe have a look at: 
>  http://code.google.com/p/lsyncd/

We checked lsyncd out and it's most certainly a very interesting tool. 
I *will* be using it in the future!

However, we found that it has some issues scaling up to really big file 
stores that we haven't seen (yet) with ZFS.

For example, the first thing it has to do when it comes online is a 
full rsync of the watched file area. This makes sense; you need to do 
this to ensure integrity. But if you have a large file store, e.g. many 
millions of files and dozens of TB, then this first step can take days, 
even if the window of downtime is mere minutes due to a restart. Since 
we're already at this stage now (and growing rapidly!) we've decided to 
keep looking for something more elegant and ZFS appears to be almost an 
exact match. We have not tested the stability of lsyncd managing the 
many millions of inode write notifications in the meantime, but just 
trying to satisfy the write needs for two smaller customers (out of 
hundreds) with lsyncd led to crashes and the need to modify kernel 
parameters.
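
To make the scaling concern concrete, here is a minimal sketch of a
pre-flight check. It assumes the kernel parameters in question are the
inotify limits (the message above does not say which ones were changed)
and relies on the fact that inotify watches are per directory, so a
recursive watcher needs roughly one watch per directory in the tree.
The function names are hypothetical; the /proc path is the standard
Linux location for fs.inotify.max_user_watches.

```python
import os
from pathlib import Path

# Assumption: the relevant kernel parameter is the inotify watch limit.
INOTIFY_LIMIT_FILE = Path("/proc/sys/fs/inotify/max_user_watches")


def count_directories(root):
    """Count root plus every directory beneath it (symlinks not followed)."""
    total = 0
    for _dirpath, _dirnames, _filenames in os.walk(root):
        total += 1
    return total


def read_watch_limit(limit_file=INOTIFY_LIMIT_FILE):
    """Return the per-user inotify watch limit, or None if unreadable."""
    try:
        return int(Path(limit_file).read_text().strip())
    except (OSError, ValueError):
        return None


def watches_fit(root, limit=None):
    """Return (watches_needed, limit, fits); fits is None if limit unknown."""
    needed = count_directories(root)
    if limit is None:
        limit = read_watch_limit()
    fits = None if limit is None else needed <= limit
    return needed, limit, fits
```

Calling watches_fit("/srv/files") (a placeholder path) before starting
a watcher gives a rough idea of whether the limit would need raising.
The directory walk is itself slow on a store with many millions of
files, which is part of the problem described above.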

As another example, lsyncd solves a (highly useful!) problem of 
replication, which is a distinctly different problem than backups. 
Replication is useful, for example as a read-only cache for remote 
application access, or for disaster recovery with near-real-time 
replication, but it's not a backup. If somebody deletes a file 
accidentally, you can't go to the replicated host and expect it to be 
there. And unless you are lsyncd'ing to a remote file system with its 
own snapshot capability, there isn't an easy way to version a backup 
short of running rsync (again) on the target to create hard links or 
something, which is itself a very slow, intensive process that takes 
days on very large filesystems.
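
The hard-link approach can be sketched as follows. This is only an
illustration of the technique, not the tooling we run: it is roughly
what rsync's --link-dest option does for unchanged files. The function
name is hypothetical, and the sketch ignores symlinked directories,
ownership and permissions. It shows why the cost grows with file count:
every file needs its own link() call even when nothing has changed.

```python
import os


def hardlink_snapshot(source, snapshot):
    """Mirror source's directory tree under snapshot, hard-linking files.

    snapshot must be on the same filesystem as source and must not be
    located inside it. Returns the number of files linked.
    """
    linked = 0
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(snapshot, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # One system call per file, regardless of file size.
            os.link(os.path.join(dirpath, name),
                    os.path.join(target_dir, name))
            linked += 1
    return linked
```

A snapshot made this way shares data blocks with the source, so it
costs little disk space, but the per-file work is what makes it take so
long on a store of this size. A filesystem-level snapshot avoids the
per-file walk entirely.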

I'll still be experimenting with lsyncd further to evaluate its real 
usefulness and performance compared to ZFS and report results. As said 
before, we'll know much more in another month or two once our next stage 
of roll out is complete.

-Ben