[CentOS] Was, Re: raid 5 install, is ZFS

Tue Jul 2 07:09:13 UTC 2019
Warren Young <warren at etr-usa.com>

On Jul 1, 2019, at 9:44 AM, mark <m.roth at 5-cent.us> wrote:
> 
> it was on Ubuntu, but that shouldn't make a difference, I would think

Indeed not.  It’s been years since your choice of OS implied a large set of OS-specific ZFS features.

There are still differences among the implementations, but those are shrinking as the community converges on ZFS-on-Linux (ZoL) as the common base.

Over time, the biggest difference among ZFS implementations will be time-based: a ZFS pool created in 2016 will have fewer feature flags than one created in 2019, so the 2019 pool won’t import on older OSes.
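
If you’re curious which feature flags a given pool has picked up, they show up as pool properties, so a quick check (using your pool name) is something like:

    $ sudo zpool get all export1 | grep feature@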

> I pulled one drive, to simulate a drive failure, and it
> rebuilt with the hot spare. Then I pushed the drive I'd pulled back in...
> and it does not look like I've got a hot spare. zpool status shows
> config:

I think you’re expecting more than ZFS tries to deliver here.  Although it’s filesystem + RAID + volume manager, it doesn’t also include storage device management features.

If you need this kind of thing to just happen automagically, you probably want to configure zed:

    https://zfsonlinux.org/manpages/0.8.0/man8/zed.8.html

But, if you can spare human cycles to deal with it, you don’t need zed.
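
If you do go the zed route, the knobs you care about live in zed’s config file, plus the pool’s autoreplace property.  The setting names below are as I remember them from a stock zed.rc, so treat this as a sketch to check against your own copy rather than a recipe:

    # /etc/zfs/zed.d/zed.rc
    ZED_EMAIL_ADDR="root"            # where fault notifications go
    ZED_SPARE_ON_IO_ERRORS=1         # kick in a hot spare after this many I/O errors
    ZED_SPARE_ON_CHECKSUM_ERRORS=10  # ...or after this many checksum errors

    # Optionally, auto-rebuild onto a new disk inserted into the same
    # physical slot as the one that failed:
    $ sudo zpool set autoreplace=on export1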

What’s happened here is that you didn’t tell ZFS that the disk is no longer part of the pool, so that when it came back, ZFS says, “Hey, I recognize that disk!  It belonged to me once.  It must be mine again.”  But then it goes and tries to fit it into the pool and finds that there are no gaps to stick it into.
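
If you want to confirm that the old label is still sitting on the disk, zdb will dump it for you.  This is a read-only check; try the whole disk first, or the first partition if ZFS partitioned the disk when it was originally added:

    $ sudo zdb -l /dev/sdb     # or /dev/sdb1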

So, one option is to remove that replaced disk from the pool, then reinsert it as the new hot spare:

    $ sudo zpool remove export1 sdb
    $ sudo zpool add export1 spare sdb

The first command removes the ZFS header info from the disk, and the second puts it back on, marking it as a spare.
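
A hedged aside: if that “zpool add” refuses the disk because it still looks like part of a pool, you can wipe the old ZFS label explicitly and then re-add it.  zpool labelclear destroys only the ZFS label, but double-check the device name before running it:

    $ sudo zpool labelclear -f /dev/sdb
    $ sudo zpool add export1 spare sdb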

Alternately, you can relieve your prior hot spare (sdl) of its new duty as sdb’s replacement, putting sdb back in its prior place:

    $ sudo zpool replace export1 sdl sdb

That does a full resilver of the replacement disk, a cost you already paid once with the hot spare failover, but it does have the advantage of keeping the disks in alphabetical order by /dev name, as you’d probably expect.
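
If you go that route, you can watch the resilver from zpool status.  Recent ZoL versions also take a repeat interval, so the second form refreshes on its own; if yours doesn’t, just re-run the first:

    $ sudo zpool status -v export1     # one-shot look at resilver progress
    $ sudo zpool status export1 5      # refresh every 5 seconds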

But, rather than get exercised about whether putting sdl between sda and sdc makes sense, I’d strongly encourage you to get away from raw /dev/sd? names.  The fastest path in your setup to logical device names is:

    $ sudo zpool export export1
    $ sudo zpool import -d /dev/disk/by-id export1

All of the raw /dev/sd? names will change to /dev/disk/by-id/* names, which embed each drive’s model and serial number and which I find to be the most convenient form for determining which disk is which when swapping out failed disks.  It doesn’t take a very smart set of remote “hands” at a site to read serial numbers off of disks to determine which is the faulted disk.
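
If you want to know ahead of time which stable name maps to which sd? device, a directory listing shows the symlinks; the grep just narrows it to the disk in question:

    $ ls -l /dev/disk/by-id/ | grep sdb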

The main problem with that scheme is that pulling disks to read their labels works best with the pool exported.  If you want to be able to do device replacement with the pool online, you need some way to associate particular disks with their placement in the server’s drive bays.

To get there, you’d have to be using GPT-partitioned disks.  ZFS normally does that these days, creating one big partition that’s optimally-aligned, which you can then label with gdisk’s “c” command.
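
As a sketch of that labeling step: gdisk’s “c” command is interactive, but sgdisk can set the same GPT partition name from a script.  The bay-style name here is just an example convention, and partition 1 is where ZFS normally puts its data:

    $ sudo sgdisk -c 1:bay04 /dev/sdb    # label sdb’s data partition after its drive bay
    $ sudo partprobe /dev/sdb            # re-read the table so the by-partlabel link appears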

Having done that, you can then do “zpool import -d /dev/disk/by-partlabel” instead, which gets you the logical disk naming scheme I’ve spoken of twice in the other thread.
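
That’s the same export/import dance as before, just pointed at the partition labels:

    $ sudo zpool export export1
    $ sudo zpool import -d /dev/disk/by-partlabel export1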

If you must use whole-disk vdevs, then I’d at least write the last few digits of each drive’s serial number on the drive cage or the end of the drive itself, so you can just tell the tech “remove the one marked ab212”.
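
To collect those serial numbers without pulling drives, lsblk or smartctl will usually read them in place; lsblk’s SERIAL column depends on your udev setup, so it can come up blank behind some controllers:

    $ lsblk -o NAME,SIZE,SERIAL
    $ sudo smartctl -i /dev/sdb | grep -i serial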

Note by the way that all of this happened because you reintroduced a ZFS-labeled disk into the pool.  That normally doesn’t happen.  Normally, a replacement is a brand new disk, without any ZFS labeling on it, so you’d jump straight to the “zpool add” step.  The prior hot spare took over, so now you’re just giving the pool a hot spare again.