[CentOS] Kernel panic after removing SW RAID1 partitions, setting up ZFS.

Tue Apr 9 09:53:55 UTC 2019
Simon Matter <simon.matter at invoca.ch>

> In article <6566355.ijNRhnPfCt at tesla.schoolpathways.com>,
> Benjamin Smith <lists at benjamindsmith.com> wrote:
>> System is CentOS 6 all up to date, previously had two drives in MD RAID
>> configuration.
>>
>> md0: sda1/sdb1, 20 GB, OS / Partition
>> md1: sda2/sdb2, 1 TB, data mounted as /home
>>
>> Installed kmod ZFS via yum, reboot, zpool works fine. Backed up the
>> /home data 2x, then stopped the sd[ab]2 partition with:
>>
>> mdadm --stop /dev/md1;
>> mdadm --zero-superblock /dev/sd[ab]1;
>
> Did you mean /dev/sd[ab]2 instead?
>
>> Removed /home in /etc/fstab. Used fdisk to set the partition type to gpt
>> for sda2 and sdb2, then built *then destroyed* a ZFS mirror pool using
>> the two partitions.
>>
>> Now the system won't boot, has a kernel panic. I'm remote, so I'll be
>> going in tomorrow to see what's up. My assumption is that it has
>> something to do with mdadm/RAID not being "fully removed".
>>
>> Any idea what I might have missed?
>
> I think it's because you clobbered md0 when you did --zero-superblock on
> sd[ab]1 instead of 2.
>
> Don't you love it when some things count from 0 and others from 1?

That really is a problem, but a difficult one to fix, I guess. IMHO it's
better to keep things the way they are unless the new solution is clearly
better than the old behavior. The new Linux Ethernet naming scheme is a
good example of how not to do it, if you ask me.

But here, mdadm could have done better: --zero-superblock checks whether
the device contains a valid md superblock, but it does not also check
whether the device belongs to a running md device :-(
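
For what it's worth, such a check is easy to do by hand before zeroing
anything. A rough sketch, assuming the usual /proc/mdstat layout where
members show up as e.g. "sda1[0]" on the "mdN : ..." lines. The function
names is_md_member and safe_zero_superblock are made up for illustration,
and the wrapper only prints the mdadm command instead of running it:

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the partition name (e.g. "sda1") is
# listed as a member of an array in an mdstat-format file.
is_md_member() {
    part=$1
    mdstat=${2:-/proc/mdstat}
    # Member entries look like "sda1[0]" on lines starting with "mdN :".
    grep -Eq "^md[^ ]* :.* ${part}\[[0-9]+\]" "$mdstat"
}

# Hypothetical wrapper: refuse if the device is in a running array,
# otherwise print (not run) the mdadm command as a dry run.
safe_zero_superblock() {
    dev=$1
    if is_md_member "$(basename "$dev")" "${2:-/proc/mdstat}"; then
        echo "refusing: $dev is part of a running md array" >&2
        return 1
    fi
    echo mdadm --zero-superblock "$dev"
}
```

With md0 still running on sda1/sdb1, safe_zero_superblock /dev/sda1 would
refuse, while /dev/sda2 would pass the check once md1 has been stopped.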

If it turns out that this is your problem, maybe you could ask the mdadm
developers to improve it?

Regards,
Simon