On 12/14/2013, 04:00, lists at benjamindsmith.com wrote:
> We checked lsyncd out and it's most certainly a very interesting tool.
> I *will* be using it in the future!
>
> However, we found that it has some issues scaling up to really big file
> stores that we haven't seen (yet) with ZFS.
>
> For example, the first thing it has to do when it comes online is a
> full rsync of the watched file area. This makes sense; you need to do
> this to ensure integrity. But if you have a large file store, e.g. many
> millions of files and dozens of TB, then this first step can take days,
> even if the window of downtime is mere minutes due to a restart. Since
> we're already at this stage now (and growing rapidly!) we've decided to
> keep looking for something more elegant, and ZFS appears to be almost an
> exact match. We have not tested the stability of lsyncd managing the
> many millions of inode write notifications in the meantime, but just
> trying to satisfy the write needs for two smaller customers (out of
> hundreds) with lsyncd led to crashes and the need to modify kernel
> parameters.
>
> As another example, lsyncd solves a (highly useful!) problem of
> replication, which is a distinctly different problem than backups.
> Replication is useful, for example as a read-only cache for remote
> application access, or for disaster recovery with near-real-time
> replication, but it's not a backup. If somebody deletes a file
> accidentally, you can't go to the replicated host and expect it to be
> there. And unless you are lsyncd'ing to a remote file system with its
> own snapshot capability, there isn't an easy way to version a backup
> short of running rsync (again) on the target to create hard links or
> something - itself a very slow, intensive process with very large
> filesystems. (days)
>
> I'll still be experimenting with lsyncd further to evaluate its real
> usefulness and performance compared to ZFS and report results. As said
> before, we'll know much more in another month or two once our next stage
> of roll out is complete.
>
> -Ben

Hi Ben,

Yes, the initial replication of a large filesystem is *very* time
consuming! But it makes sleeping at night much easier. I did have to
crank up the inotify kernel parameters by a significant amount.

I did the initial replication using rsync directly, rather than asking
lsyncd to do it. I notice that if I reboot the primary server, it takes
a while for the inotify tables to be rebuilt ... after that it's smooth
sailing.

If you want to prevent deletion of files from your replicated filesystem
(which I do), you can modify the rsync{} array in the lsyncd.lua file by
adding the line 'delete = false' to it. This has saved my butt a few
times when a user has accidentally deleted a file on the primary server.

I agree that filesystem replication isn't really a backup. For now it's
all I have available, but at least the replicated fs is on a separate
machine.

As a side note for anyone using a file server for hosting OS X Time
Machine backups: the 'delete' parameter in rsync{} must be set to 'true'
in order to prevent chaos should a user need to point their Mac at the
replicated filesystem (which should be a very rare event). I put all TM
backups in a separate ZFS sub-pool for this reason.

Chuck
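
For anyone wanting to try the 'delete = false' setting described above,
here is a minimal sketch of what the relevant part of lsyncd.lua might
look like. The source path, host name and target path are hypothetical
placeholders, not taken from this thread. It also assumes the lsyncd 2.1
layout, where 'delete' is documented as a setting of the sync{} block
next to an optional rsync{} table; other versions may expect it
elsewhere, so check the manual for the version you run.

```lua
-- Sketch only: paths and host below are made-up placeholders.
sync {
    default.rsync,
    source = "/srv/export",               -- hypothetical tree on the primary
    target = "replica-host:/srv/replica", -- hypothetical replica location
    delete = false,  -- keep files on the replica when removed on the primary
    rsync  = {
        archive = true,  -- preserve permissions, times, symlinks, etc.
    },
}

-- For a tree holding Time Machine backups, a second sync{} block with
-- delete = true would apply instead, per the side note above.
```

With deletions turned off the replica only ever grows, so it needs to be
pruned by hand from time to time.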