[CentOS] ZFS on Linux in production?

On Thu, Oct 24, 2013 at 01:41:17PM -0700, Lists wrote:
> We are a CentOS shop, and have the lucky, fortunate problem of having 
> ever-increasing amounts of data to manage. EXT3/4 becomes tough to 
> manage when you start climbing, especially when you have to upgrade, so 
> we're contemplating switching to ZFS.
> 
> As of last spring, it appears that ZFS On Linux http://zfsonlinux.org/ 
> calls itself production ready despite a version number of 0.6.2, and 
> being acknowledged as unstable on 32 bit systems.
> 
> However, given the need to do backups, zfs send sounds like a godsend 
> over rsync which is running into scaling problems of its own. (EG: 
> Nightly backups are being threatened by the possibility of taking over 
> 24 hours per backup)
> 
> Was wondering if anybody here could weigh in with real-life experience? 
> Performance/scalability?
> 
> -Ben
> 
> PS: I joined their mailing list recently, will be watching there as 
> well. We will, of course, be testing for a while before "making the 
> switch".

Joining the discussion late, and don't really have anything to
contribute on the ZFSonLinux side of things...

At $DAYJOB we have been running ZFS via Nexenta (previously via Solaris
10) for many years.  We have about 5PB of this and the primary use case
is for backups and handling of imagery.

For the most part, we really, really like ZFS.  My feeling is that ZFS
itself (at least in the *Solaris form) is rock solid and stable.  Other
pieces of the stack -- namely SMB/CIFS and some of the management tools
provided by the various vendors -- are a bit more questionable.  We
spend a bit more time fighting weirdnesses with things higher up the
stack than we do, say, on our NetApp environment.  To be expected.

I'm waiting for Red Hat or someone else to come out and support ZFS --
perhaps unlikely due to legality questions, but if I could marry the
power of ZFS with the software stack in Linux (Samba!!), I'd be mighty
happy.  Yes -- we could run Samba on our Nexenta boxes, but it isn't
"supported".

Echoing what many others say:

- ZFS is memory hungry.  All of our production boxes have 144GB of
  memory in them, and some have SSDs for ZIL or L2ARC depending on the
  workload.
- Powerful redundancy is possible.  Our environment is built on top of
  Dell MD1200 JBODs, all dual pathed up to dual LSI SAS switches.  Our
  vdevs (RAID groups) are sized to match the number of JBODs, with the
  individual disks of each vdev spread one per JBOD.  We use triple
  parity RAID (RAIDZ3) and as such can lose three entire JBODs without
  suffering
  any data loss.  We actually had one JBOD go flaky on us and were able
  to hot yank it out, put in a new one with zero downtime (and much
  shorter resilver/rebuild times than you'd get with regular RAID).
- We make heavy use of snapshots and clones.  Probably have 200-300 on
  some systems and we use them to do release management for collections
  of imagery.  Very powerful and haven't run into performance issues
  yet.
  * Snapshots let us take "diffs" between versions quite easily.  We
    then stream these diffs to an identical ZFS system at a DR site and
    merge in the changes.  Our network pipe isn't big enough yet to do
    this quickly, so we typically just plug in another SAS JBOD with a
    zpool on it, stream the diffs there as a flat file, sneakernet the
    JBOD to the DR site, plug it in, import the zpool and slurp in the
    differences.  Pretty cool.
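
For anyone who wants to picture the sneakernet workflow above, here is
a rough sketch in Python.  It only builds the command lines and does
not run anything.  The pool, dataset, snapshot and file names are all
made up for illustration, and I'm assuming the transport JBOD holds a
pool called "transport" mounted at /transport; adjust to taste.

```python
import shlex


def send_increment_to_file(dataset, old_snap, new_snap, stream_path):
    """Command that writes the diff between two snapshots to a flat file."""
    send = ["zfs", "send", "-i", f"{dataset}@{old_snap}", f"{dataset}@{new_snap}"]
    return f"{shlex.join(send)} > {shlex.quote(stream_path)}"


def receive_from_file(dataset, stream_path):
    """Command that merges a previously saved stream into the DR copy."""
    return f"zfs receive {shlex.quote(dataset)} < {shlex.quote(stream_path)}"


def sneakernet_plan(dataset, old_snap, new_snap, transport_pool, stream_path):
    """Ordered steps: dump the diff, move the pool by hand, replay the diff."""
    return [
        # --- primary site ---
        send_increment_to_file(dataset, old_snap, new_snap, stream_path),
        f"zpool export {shlex.quote(transport_pool)}",
        # --- carry the JBOD to the DR site, then ---
        f"zpool import {shlex.quote(transport_pool)}",
        receive_from_file(dataset, stream_path),
    ]


if __name__ == "__main__":
    # Hypothetical names throughout.
    for step in sneakernet_plan(
        "tank/imagery", "rel-41", "rel-42", "transport", "/transport/imagery-41-42.zfs"
    ):
        print(step)
```

The incremental receive only works if the DR side still has the older
snapshot ("rel-41" in this made-up example) and hasn't been modified
since, so the two systems need to stay in step.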
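
On the RAIDZ3 layout mentioned in the list above, the reason a whole
JBOD can vanish is simply that each vdev only has one disk in any
given JBOD.  A toy model in Python, assuming 8 JBODs (a number I
picked for the example, not our real count) with 12 disk slots each:

```python
def build_vdevs(num_jbods, disks_per_jbod):
    """vdev number `slot` takes disk `slot` from every JBOD.

    Each vdev is therefore as wide as the number of JBODs and has
    exactly one member disk in each enclosure.
    """
    return [
        [(jbod, slot) for jbod in range(num_jbods)]
        for slot in range(disks_per_jbod)
    ]


def survives(vdevs, failed_jbods, parity=3):
    """True if no vdev has lost more than `parity` disks (3 for RAIDZ3)."""
    failed = set(failed_jbods)
    return all(
        sum(1 for jbod, _slot in vdev if jbod in failed) <= parity
        for vdev in vdevs
    )


if __name__ == "__main__":
    vdevs = build_vdevs(num_jbods=8, disks_per_jbod=12)  # hypothetical sizes
    print(survives(vdevs, {0, 1, 2}))     # three enclosures gone
    print(survives(vdevs, {0, 1, 2, 3}))  # four enclosures gone
```

Losing three enclosures costs every vdev three disks, which triple
parity can absorb; a fourth enclosure pushes every vdev past its
parity and the pool is gone.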

As I mentioned, we have run into a few weird quirks.  Mainly around
stability of the management GUI (or lack of basic features like
"useful" SNMP based monitoring), performance with CIFS and oddnesses
like high system load in certain edge cases.  Some general rough edges
I suppose that we've been OK dealing with.  The Nexenta guys are super
smart, but of course they're a smaller shop and don't have the
resources behind them that CentOS does with Red Hat.

My guess is that this would be exacerbated to some extent on the Linux
platform at this point.  I personally wouldn't want to use ZFS on Linux
for our customer data serving workloads, but might consider it for
something purely internal.

Ray