On Thu, Oct 24, 2013 at 01:41:17PM -0700, Lists wrote:
> We are a CentOS shop, and have the lucky, fortunate problem of having
> ever-increasing amounts of data to manage. EXT3/4 becomes tough to
> manage when you start climbing, especially when you have to upgrade, so
> we're contemplating switching to ZFS.
>
> As of last spring, it appears that ZFS On Linux http://zfsonlinux.org/
> calls itself production ready despite a version number of 0.6.2, and
> being acknowledged as unstable on 32 bit systems.
>
> However, given the need to do backups, zfs send sounds like a godsend
> over rsync which is running into scaling problems of its own. (EG:
> Nightly backups are being threatened by the possibility of taking over
> 24 hours per backup)
>
> Was wondering if anybody here could weigh in with real-life experience?
> Performance/scalability?
>
> -Ben
>
> PS: I joined their mailing list recently, will be watching there as
> well. We will, of course, be testing for a while before "making the
> switch".

Joining the discussion late, and I don't really have anything to contribute on the ZFSonLinux side of things...

At $DAYJOB we have been running ZFS via Nexenta (previously via Solaris 10) for many years. We have about 5PB of it, and the primary use case is backups and handling of imagery. For the most part, we really, really like ZFS. My feeling is that ZFS itself (at least in the *Solaris form) is rock solid and stable. Other pieces of the stack -- namely SMB/CIFS and some of the management tools provided by the various vendors -- are a bit more questionable. We spend a bit more time fighting weirdness higher up the stack than we do on, say, our NetApp environment. To be expected.

I'm waiting for Red Hat or someone else to come out and support ZFS -- perhaps unlikely due to legality questions, but if I could marry the power of ZFS with the software stack in Linux (Samba!!), I'd be mighty happy. Yes -- we could run Samba on our Nexenta boxes, but it isn't "supported".

Echoing what many others have said:

- ZFS is memory hungry. All of our PRD boxes have 144GB of memory in
  them, and some have SSDs for ZIL or L2ARC, depending on the workload.

- Powerful redundancy is possible. Our environment is built on top of
  Dell MD1200 JBODs, all dual-pathed up to dual LSI SAS switches. Our
  vdevs (RAID groups) are sized to match the number of JBODs, with the
  individual disks spread across each JBOD. We use triple-parity RAID
  (RAIDZ3) and as such can lose three entire JBODs without suffering any
  data loss. We actually had one JBOD go flaky on us and were able to
  hot-yank it out and put in a new one with zero downtime (and much
  shorter resilver/rebuild times than you'd get with regular RAID). A
  rough sketch of this layout follows the list.

- We make heavy use of snapshots and clones. We probably have 200-300 on
  some systems, and we use them to do release management for collections
  of imagery. Very powerful, and we haven't run into performance issues
  yet.

  * Snapshots let us take "diffs" between versions quite easily. We then
    stream these diffs to an identical ZFS system at a DR site and merge
    in the changes. Our network pipe isn't big enough yet to do this
    quickly, so we typically just plug in another SAS JBOD with a zpool
    on it, stream the diffs there as a flat file, sneakernet the JBOD to
    the DR site, plug it in, import the zpool and slurp in the
    differences. Pretty cool. (The second sketch below shows the
    commands involved.)
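To give a feel for the layout: here's a minimal zpool sketch assuming six JBODs, with invented cXtYdZ device names (your controller numbering will certainly differ). The point is that each raidz3 vdev takes exactly one disk from each JBOD, which is what makes whole-JBOD failures survivable.

    # Hypothetical naming: cN = JBOD number N, one disk per JBOD per vdev.
    # A dead JBOD costs each raidz3 vdev only one disk, and raidz3
    # tolerates three, so three whole JBODs can fail without data loss.
    zpool create tank \
        raidz3 c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0 \
        raidz3 c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0

    # Optional SSDs for the intent log (ZIL) and read cache (L2ARC):
    zpool add tank log mirror c7t0d0 c7t1d0
    zpool add tank cache c7t2d0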
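And here's roughly what the snapshot-diff/sneakernet dance looks like. Pool and dataset names ("tank/imagery", "transfer") are made up for illustration, and the DR side is assumed to already hold the previous snapshot that the incremental is based on.

    # On the primary: take the new release snapshot and dump the
    # incremental stream as a flat file onto the transfer JBOD's pool.
    zfs snapshot tank/imagery@rel-42
    zfs send -i tank/imagery@rel-41 tank/imagery@rel-42 \
        > /transfer/imagery_rel41-42.zfs
    zpool export transfer

    # Drive the JBOD to the DR site, then:
    zpool import transfer
    zfs receive -F tank/imagery < /transfer/imagery_rel41-42.zfs

    # Clones give us cheap writable copies for release management:
    zfs clone tank/imagery@rel-42 tank/imagery-rel42-work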
As I mentioned, we have run into a few weird quirks, mainly around stability of the management GUI (or the lack of basic features like "useful" SNMP-based monitoring), performance with CIFS, and oddities like high system load in certain edge cases. Some general rough edges, I suppose, that we've been OK dealing with. The Nexenta guys are super smart, but of course they're a smaller shop and don't have the resources behind them that CentOS does with Red Hat. My guess is that this would be exacerbated to some extent on the Linux platform at this point. I personally wouldn't want to use ZFS on Linux for our customer data serving workloads, but I might consider it for something purely internal.

Ray