[CentOS] Btrfs going forward, was: Errors on an SSD drive

hw hw at gc-24.de
Fri Aug 11 17:37:05 UTC 2017


Chris Murphy wrote:
> Changing the subject since this is rather Btrfs specific now.
>
>
>
> On Fri, Aug 11, 2017 at 5:41 AM, hw <hw at gc-24.de> wrote:
>> Chris Murphy wrote:
>>>
>>> On Wed, Aug 9, 2017, 11:55 AM Mark Haney <mark.haney at neonova.net> wrote:
>>>
>>>> To be honest, I'd not try a btrfs volume on a notebook SSD. I did that on
>>>> a couple of systems and it corrupted pretty quickly. I'd stick with
>>>> xfs/ext4 if you manage to get the drive working again.
>>>
>>> Sounds like a hardware problem. Btrfs is explicitly optimized for SSD, the
>>> maintainers worked for FusionIO for several years of its development. If
>>> the drive is silently corrupting data, Btrfs will pretty much immediately
>>> start complaining where other filesystems will continue. Bad RAM can also
>>> result in scary warnings that you won't see with other filesystems. And I've
>>> been using it on numerous SSDs for years and NVMe for a year with zero
>>> problems.
>>
>>
>> That's one thing I've been wondering about: when using btrfs RAID, do you
>> need to somehow monitor the disks to see if one has failed?
>
> Yes.
>
> The block layer has no faulty device handling, i.e. it just reports
> whatever problems the device or the controller report, whereas md/mdadm
> and md/LVM have implemented policies for ejecting a block device
> (setting it to faulty). Btrfs does not do that; it'll just keep trying
> to use a faulty device.
>
> So you have to setup something that monitors for either physical
> device errors, or btrfs errors or both, depending on what you want.

I want to know when a drive has failed.  How can I monitor that?  I've only
recently begun to use btrfs.
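
Would something along these lines be the right approach?  This is only a
rough sketch, assuming btrfs-progs is installed and that /mnt/data (a
placeholder) is the mounted btrfs filesystem; it reads the per-device error
counters that `btrfs device stats` reports and complains if any of them is
non-zero:

#!/usr/bin/env python3
"""Rough sketch: warn when any btrfs per-device error counter is non-zero.
Assumes btrfs-progs is installed; /mnt/data is a placeholder mount point."""
import subprocess
import sys

MOUNTPOINT = "/mnt/data"  # placeholder: the mounted btrfs filesystem


def btrfs_device_errors(mountpoint):
    """Return (counter, value) pairs whose error count is non-zero."""
    out = subprocess.check_output(
        ["btrfs", "device", "stats", mountpoint], universal_newlines=True
    )
    errors = []
    for line in out.splitlines():
        # Lines look like: "[/dev/sda1].write_io_errs   0"
        counter, _, value = line.rpartition(" ")
        if value.strip().isdigit() and int(value) > 0:
            errors.append((counter.strip(), int(value)))
    return errors


if __name__ == "__main__":
    errs = btrfs_device_errors(MOUNTPOINT)
    for counter, value in errs:
        print("WARNING: {} = {}".format(counter, value), file=sys.stderr)
    sys.exit(1 if errs else 0)

I imagine running something like that from cron so it mails me, plus
smartctl for the physical side, if that is what you mean by monitoring both.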

>>> On CentOS though, I'd get newer btrfs-progs RPM from Fedora, and use
>>> either
>>> an elrepo.org kernel, a Fedora kernel, or build my own latest long-term
>>> from kernel.org. There's just too much development that's happened since
>>> the tree found in RHEL/CentOS kernels.
>>
>>
>> I can't go with a more recent kernel version until NVIDIA has updated their
>> drivers to no longer need fence.h (or whatever it was).
>>
>> And I thought stuff gets backported, especially things as important as file
>> systems.
>
> There are 1500 to 3000 line changes to Btrfs code per kernel release.
> That's too much for most of it to be backported. Serious fixes do get
> backported by upstream to longterm kernels, but to what degree you
> have to check the upstream changelogs to find out.
>
> And right now most backports go only to 4.4 and 4.9. And I can't tell
> you what kernel-3.10.0-514.10.2.el7.x86_64.rpm translates into; near as
> I can tell that requires a secret decoder ring, as it's a kernel made
> from multiple branches and then also a bunch of separate patches.

So these kernels are a mess.  What's the point of backports when they aren't
done correctly?

This puts a big "stay away" stamp on RHEL/CentOS.

>>> Also FWIW Red Hat is deprecating Btrfs, in the RHEL 7.4 announcement.
>>> Support will be removed probably in RHEL 8. I have no idea how it'll
>>> affect
>>> CentOS kernels though. It will remain in Fedora kernels.
>>
>>
>> That would suck badly, to the point at which I'd have to look for yet
>> another distribution.  The only one remaining is Arch.
>>
>> What do they suggest as a replacement?  The only other FS that comes close
>> is ZFS, and removing btrfs altogether would be taking living in the past
>> too many steps too far.
>
> Red Hat are working on a new user space wrapper and volume format
> based on md, device mapper, LVM, and XFS.
> http://stratis-storage.github.io/
> https://stratis-storage.github.io/StratisSoftwareDesign.pdf
>
> It's an aggressive development schedule, and as so much of it is
> journaling and CoW based I have no way to assess whether it ends up

So in another 15 or 20 years, some kind of RH file system might become
usable.

I'd say they need to wake up, because the need for the features provided by
ZFS and btrfs has already existed for years.  Even their current XFS
implementation is flawed because there is no way to install onto an XFS that
is tuned to the hardware RAID volume it is created on, the way it is supposed
to be.
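
What I mean is that the installer gives you no way to pass the stripe
geometry to mkfs.xfs yourself.  Done by hand it is only something like the
following sketch (the device and the geometry values are placeholders that
would have to match the actual controller settings):

#!/usr/bin/env python3
# Sketch only: create an XFS aligned to a hardware RAID's stripe geometry.
# The device and geometry below are placeholders; su must match the RAID
# chunk size and sw the number of data-bearing disks in the array.
import subprocess

DEVICE = "/dev/sdb"    # placeholder: the hardware RAID volume
CHUNK_SIZE = "256k"    # placeholder: per-disk chunk size ("stripe unit")
DATA_DISKS = 6         # placeholder: disks minus parity ("stripe width")

subprocess.check_call(
    ["mkfs.xfs", "-d", "su={},sw={}".format(CHUNK_SIZE, DATA_DISKS), DEVICE]
)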

> [...]
> tested. But this is by far the most cross platform solution: FreeBSD,
> Illumos, Linux, macOS. And ZoL has RHEL/CentOS specific packages.

That can be an advantage.

What is the state of ZFS for CentOS?  I'm going to need it because I have
data on some disks that were used for ZFS and now need to be read by a
machine running CentOS.

Does it require a particular kernel version?

> But I can't tell you for sure what ZoL's faulty device behavior is
> either, whether it ejects faulty or flaky devices and when, or if,
> like Btrfs, it just tolerates them.

You can monitor the disks and see when one has failed.
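
At least there I can check the pool health with `zpool status -x`, which
prints "all pools are healthy" when nothing is wrong.  A minimal sketch of a
cron check, assuming ZoL is installed (treat the exact output string as an
assumption, it may differ between versions):

#!/usr/bin/env python3
# Minimal sketch: exit non-zero and print the report when a ZFS pool is
# degraded or faulted.  Assumes the `zpool` command from ZoL is available.
import subprocess
import sys

report = subprocess.check_output(
    ["zpool", "status", "-x"], universal_newlines=True
).strip()

if report != "all pools are healthy":
    # Anything else (DEGRADED, FAULTED, ...) ends up in the cron mail.
    print(report, file=sys.stderr)
    sys.exit(1)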

> The elrepo.org folks can still sanely set CONFIG_BTRFS_FS=m, but I
> suspect that if RHEL unsets it in RHEL 8 kernels, CentOS will do the
> same.

Sanely?  With the kernel being such a mess?




