[CentOS] LVM failure after CentOS 7.6 upgrade -- possible corruption

Wed Dec 5 17:56:42 UTC 2018
Simon Matter <simon.matter at invoca.ch>

> I've started updating systems to CentOS 7.6, and so far I have one
> failure.
>
> This system has two peculiarities which might have triggered the
> problem.  The first is that one of the software RAID arrays on this
> system is degraded.  While troubleshooting the problem, I saw similar
> error messages mentioned in bug reports indicating that systems
> would not boot with degraded software RAID arrays.  The other
> peculiar aspect is that the system uses dm-cache.
>
> Logs from some of the early failed boots are not available, but before I
> completely fixed the problem, I was able to bring the system up once,
> and captured logs which look substantially similar to the initial boot.
> The content of /var/log/messages is here:
> 	https://paste.fedoraproject.org/paste/n-E6X76FWIKzIvzPOw97uw
>
> The output of lsblk (minus some VM logical volumes) is here:
> 	https://paste.fedoraproject.org/paste/OizFvMeGn81vF52VEvUbyg
>
> As best I can tell, the LVM tools were treating software RAID component
> devices as PVs, and detecting a conflict between those and the assembled
> RAID volume.  When running "pvs" on the broken system, no RAID volumes
> were listed, only component devices.  At the moment, I don't know if the
> LVs that were activated by the initrd were backed by component devices
> or the RAID devices, so it's possible that this bug might corrupt
> software RAID arrays.
>
> In order to correct the problem, I had to add a global_filter to
> /etc/lvm/lvm.conf and rebuild the initrd (dracut -f):
> 	global_filter = [ "r|vm_.*_data|", "a|sdd1|", "r|sd..|" ]
>
> This filter excludes the LVs that contain VM data, accepts "/dev/sdd1"
> which is the dm-cache device, and rejects all other partitions on
> SCSI(SATA) device nodes, as all of those are RAID component devices.
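
For anyone reading along, here is a rough sketch of how I understand that
filter to be evaluated: the patterns are unanchored regular expressions,
they are tried in order, the first one that matches decides, and a path
that matches nothing is accepted.  The function name and the example
paths (/dev/md0, /dev/vg0/vm_web_data) are made up for illustration, and
the sketch only looks at a single path, whereas LVM considers every name
a device is known by, so treat it as an approximation and not as what
LVM does internally.

```shell
#!/usr/bin/env bash
# Illustrative only: mimic first-match-wins evaluation of an LVM filter.
# The entries are the ones from the global_filter quoted above.
filter=( 'r|vm_.*_data|' 'a|sdd1|' 'r|sd..|' )

# lvm_filter_decision PATH  (hypothetical helper, not an LVM tool)
# Prints "accept" or "reject" for one device path.
lvm_filter_decision() {
    local path=$1 entry action regex
    for entry in "${filter[@]}"; do
        action=${entry:0:1}               # leading 'a' or 'r'
        regex=${entry:2:${#entry}-3}      # text between the two '|' delimiters
        if [[ $path =~ $regex ]]; then    # unanchored match, like LVM's
            if [ "$action" = "a" ]; then echo accept; else echo reject; fi
            return 0
        fi
    done
    echo accept                           # nothing matched: accepted by default
}

# Example paths; /dev/md0 and the LV name are assumptions.
for p in /dev/sdd1 /dev/sda1 /dev/md0 /dev/vg0/vm_web_data; do
    printf '%-24s %s\n' "$p" "$(lvm_filter_decision "$p")"
done
```

Note the ordering: "a|sdd1|" has to come before "r|sd..|", otherwise the
cache device would be rejected along with the RAID components.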
>
> I'm still working on the details of the problem, but I wanted to share
> what I know now in case anyone else might be affected.
>
> After updating, look at the output of "pvs" if you use LVM on software
> RAID.
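
One way to do that check, as a minimal sketch: feed the PV names from
"pvs" through a helper that flags any PV whose on-disk signature is a
software RAID member.  The helper names are hypothetical, and this
assumes blkid reports TYPE=linux_raid_member for component devices on
the affected system, which I have not verified for this bug.  It needs
root to be meaningful.

```shell
#!/usr/bin/env bash
# Illustrative only: list PVs that look like md RAID component devices.

# probe_type DEV  (hypothetical helper): print the signature type of DEV.
probe_type() {
    blkid -s TYPE -o value "$1"
}

# raid_member_pvs  (hypothetical helper): read PV names on stdin, one per
# line, and print those that blkid identifies as RAID members.
raid_member_pvs() {
    local pv
    while read -r pv; do                  # read strips pvs' leading blanks
        [ -n "$pv" ] || continue
        if [ "$(probe_type "$pv")" = "linux_raid_member" ]; then
            printf '%s\n' "$pv"
        fi
    done
}

# Usage (as root):
#   pvs --noheadings -o pv_name | raid_member_pvs
```

Any device it prints is being used as a PV directly and not through the
assembled md device, which is the situation described above.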

What exactly did `pvs' show, and what did you expect it to show instead?

Regards,
Simon