[CentOS] Btrfs going forward, was: Errors on an SSD drive

Fri Aug 11 18:12:23 UTC 2017
Warren Young <warren at etr-usa.com>

On Aug 11, 2017, at 11:00 AM, Chris Murphy <lists at colorremedies.com> wrote:
> 
> On Fri, Aug 11, 2017 at 5:41 AM, hw <hw at gc-24.de> wrote:
>> That's one thing I've been wondering about:  When using btrfs RAID, do you
>> need to somehow monitor the disks to see if one has failed?
> 
> Yes.
> 
> The block layer has no faulty device handling

That is one of the open questions about Stratis: should its stratisd act in the place of smartd?

Vote and comment on its GitHub issue here:

    https://github.com/stratis-storage/stratisd/issues/72

I’m in favor of it.  The daemon has to be there anyway, so it makes sense to push SMART failure indicators up through the block layer into the volume manager layer, where it can react intelligently to the failure.  FreeBSD’s ZFS is getting such a daemon soon, so we want one, too:

    https://www.phoronix.com/scan.php?page=news_item&px=ZFSD-For-FreeBSD

>>> Also FWIW Red Hat is deprecating Btrfs, in the RHEL 7.4 announcement.
>>> Support will be removed probably in RHEL 8. I have no idea how it'll
>>> affect CentOS kernels though. It will remain in Fedora kernels.

I rather doubt btrfs will be compiled out of the kernel in EL8, and even if it is, it’ll probably be in the CentOSPlus kernels.

What you *won’t* get from Red Hat is the ability to install EL8 onto a btrfs volume from within Anaconda.  The btrfs tools won’t be installed by default, and if you have a Red Hat subscription, Red Hat won’t be all that willing to help you with btrfs-related problems.

But will you be able to install EL8 onto an existing XFS-formatted boot volume and mount your old btrfs data volume?  I guess “yes.”

I suspect you’ll even be able to manually create new btrfs data volumes in EL8.

>> That would suck badly to the point at which I'd have to look for yet another
>> distribution.  The only one remaining is Arch.

openSUSE defaults to btrfs on root, though XFS on /home for some reason:

    https://goo.gl/Hiuzbu

>> What do they suggest as a replacement?

Stratis: https://stratis-storage.github.io/StratisSoftwareDesign.pdf

The main downside I see to Stratis is timing.  Based on the release dates of RHELs past, it looks like 1.0 is scheduled to coincide with RHEL 8, which means it won’t have any kind of redundant storage options to begin with.  It won’t even have RAID-1, the only meaningful RAID level when it comes to comparing against btrfs.

The claim is that “enterprise” users don’t want software RAID anyway, so the Stratis developers don’t need to provide it in whatever version of Stratis ships with EL8.  I think my reply to that holds true for many of us CentOS users:

    https://github.com/stratis-storage/stratis-docs/issues/54

Ah well, my company has historically been skipping even-numbered RHEL releases anyway due to lack of compelling reasons to migrate from the prior odd-numbered release still being supported.  Maybe Stratis will be ready for prime time by the time EL9 ships.

>> removing btrfs altogether would be taking living in the past too
>> many steps too far.

The Red Hat/Fedora developers are well aware that they started out ~7 years behind when they pushed btrfs forward as a “technology preview” with RHEL 6, and are now more like 12 years behind the ZFS world after waiting in vain for btrfs to catch up.

Basically, Stratis is their plan to catch up on the cheap, building atop existing, tested infrastructure already in Linux.

My biggest worry is that because it’s not integrated top-to-bottom like ZFS is, they’ll miss out on some of the key advantages you have with ZFS.

I’m all for making the current near-manual LVM2 + MD + DM + XFS lash-up more integrated and automated, even if it’s just a pretty face in front of those same components.  The question is how well that interface mimics the end user experience of ZFS, which in my mind still provides the best CLI experience, even if you compare only on features they share in common.  btrfs’ tools are close, but I guess the correct command much more often with ZFS’ tools.

That last point, guessable commands, is an explicit goal of the Stratis project.  They know that filesystem maintenance is not a daily task for most of us, so we tend to forget commands we haven’t used in months.  It is a major feature of a filesystem to have commands you can guess correctly based on fuzzy memories of having used them once months ago.

> Canonical appears to be charging ahead with OpenZFS included by
> default out of the box (although not for rootfs yet I guess)

Correct.  ZFS-on-root-on-Ubuntu is still an unholy mess:

    https://github.com/zfsonlinux/zfs/wiki/Ubuntu

> I can't tell you for sure what ZoL's faulty device behavior is
> either, whether it ejects faulty or flaky devices and when, or if, like
> Btrfs, it just tolerates it.

Lacking something like zfsd, I’d guess it just tolerates it, and that you need to pair it with smartd to have notification of failing devices.  You could script that to have automatic spare replacement.

Or, port FreeBSD’s zfsd over.
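
To make the scripted spare replacement idea above a bit more concrete, here is a minimal sketch of my own, not anything ZoL or smartmontools ships.  It assumes `smartctl -H` reports a failing disk by setting bit 3 (value 8) of its exit status, as the smartctl man page describes, and that `zpool replace POOL OLD NEW` is the right way to swap in the spare.  The pool name, device paths, and function names are all hypothetical, and it defaults to a dry run so it only reports the command it would have run.

```python
import subprocess

# Per smartctl(8), bit 3 (value 8) of the exit status means the SMART
# status check returned "DISK FAILING".  Other bits report other conditions.
SMART_FAILING_BIT = 0x08


def smart_says_failing(exit_status):
    """Return True if smartctl's exit status has the DISK FAILING bit set."""
    return bool(exit_status & SMART_FAILING_BIT)


def replace_command(pool, failing, spare):
    """Build the zpool command that swaps a failing device for a spare."""
    return ["zpool", "replace", pool, failing, spare]


def check_and_replace(pool, device, spare, dry_run=True, run=subprocess.run):
    """Check one device's SMART health; replace it if it reports failing.

    Returns the zpool command that was (or, in a dry run, would have been)
    executed, or None if the device looks healthy.  The `run` parameter
    exists only so the subprocess call can be stubbed out.
    """
    status = run(["smartctl", "-H", device]).returncode
    if not smart_says_failing(status):
        return None
    cmd = replace_command(pool, device, spare)
    if not dry_run:
        run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical pool and devices; substitute your own before trusting it.
    planned = check_and_replace("tank", "/dev/sdb", "/dev/sdc")
    print("would run:" if planned else "healthy:", planned or "/dev/sdb")
```

You would run something like this from cron, or hang it off smartd’s `-M exec` directive, and only flip `dry_run` off once you trust it.  It only looks at the overall SMART health verdict, so a disk that is throwing I/O errors while still passing SMART would go unnoticed.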