[CentOS] Problem with mdadm, raid1 and automatically adds any disk to raid

Mon Feb 25 05:50:11 UTC 2019
Simon Matter <simon.matter at invoca.ch>

> Hi.
>
> CENTOS 7.6.1810, fresh install - use this as a base to create/upgrade
> new/old machines.
>
> I was trying to setup two disks as a RAID1 array, using these lines
>
>   mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/sdb1
> /dev/sdc1
>   mdadm --create --verbose /dev/md1 --level=0 --raid-devices=2 /dev/sdb2
> /dev/sdc2
>   mdadm --create --verbose /dev/md2 --level=0 --raid-devices=2 /dev/sdb3
> /dev/sdc3
>
> then I ran lsblk and realized that I had used --level=0 instead of
> --level=1 (a typo).
> The SIZE was reported as double because I had created a striped set by
> mistake, when I wanted a mirrored one.
>
> Here is where my problem starts: I cannot get rid of the /dev/mdX
> devices no matter what I do (or try to do).
>
> I tried to delete the mdX devices: I removed the disks by failing them,
> then removed each array md0, md1 and md2.
> I also did
>
>   dd if=/dev/zero of=/dev/sdX bs=512 seek=$(($(blockdev --getsz
> /dev/sdX)-1024)) count=1024

I didn't check, but are you really sure you're cleaning up the end of the
drive? Maybe you should clean the beginning and the end of every partition
first, because depending on the metadata version the md superblock is
written inside the member partition, not at the end of the whole disk.
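
For example, something along these lines (a rough, untested sketch, using
the device names from your commands; wipefs comes with util-linux):

  # stop the arrays first so nothing holds the member partitions open
  mdadm --stop /dev/md0
  mdadm --stop /dev/md1
  mdadm --stop /dev/md2

  # zero the md superblock on every member *partition*, not only on the
  # whole disk; wipefs additionally clears any other signatures it knows
  for part in /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdc1 /dev/sdc2 /dev/sdc3; do
      mdadm --zero-superblock "$part"
      wipefs -a "$part"
  done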

>   dd if=/dev/zero of=/dev/sdX bs=512 count=1024
>   mdadm --zero-superblock /dev/sdX
>
> Then I wiped each partition of the drives using fdisk.
>
> Now, every time I start fdisk to set up a new set of partitions, I see
> this in /var/log/messages as soon as I hit "w" in fdisk:
>
>   Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives
> before activating degraded array md2..
>   Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives
> before activating degraded array md1..
>   Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives
> before activating degraded array md0..
>   Feb 25 15:38:32 webber kernel: md/raid1:md0: active with 1 out of 2
> mirrors
>   Feb 25 15:38:32 webber kernel: md0: detected capacity change from 0 to
> 5363466240
>   Feb 25 15:39:02 webber systemd: Created slice
> system-mdadm\x2dlast\x2dresort.slice.
>   Feb 25 15:39:02 webber systemd: Starting Activate md array md1 even
> though degraded...
>   Feb 25 15:39:02 webber systemd: Starting Activate md array md2 even
> though degraded...
>   Feb 25 15:39:02 webber kernel: md/raid1:md1: active with 0 out of 2
> mirrors
>   Feb 25 15:39:02 webber kernel: md1: failed to create bitmap (-5)
>   Feb 25 15:39:02 webber mdadm: mdadm: failed to start array /dev/md/1:
> Input/output error
>   Feb 25 15:39:02 webber systemd: mdadm-last-resort@md1.service: main
> process exited, code=exited, status=1/FAILURE
>
> I check /proc/mdstat and sure enough, there it is, trying to assemble an
> array I DID NOT TELL IT TO CREATE.
>
> I do NOT WANT this to happen; it creates the same "SHIT" (the incorrect
> array) over and over again (systemd frustration).

Noooooo, you're wiping it wrong :-)

> So I tried to delete them again, wiped them again, killed processes, wiped
> disks.
>
> No matter what I do, as soon as I hit "w" in fdisk, systemd tries to
> assemble the arrays again without letting me decide what to do.

<don't try this at home>
Nothing easier than that, just terminate systemd while doing the disk
management and restart it after you're done. BTW, PID is 1.
</don't try this at home>

Seriously, there is certainly some systemd/udev machinery you should be
able to deactivate before doing such things, though I'm not sure exactly
which piece it is.
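
If I remember correctly, the incremental auto-assembly is triggered from
the udev rules (something like 64-md-raid-assembly.rules), which run
"mdadm -I" on every new block device as soon as the partition table is
rescanned, and the "last resort" timers in your log
(mdadm-last-resort@mdX.timer) then force-start whatever was found, even
degraded. Two things you could try while repartitioning (untested, paths
from memory):

  # tell mdadm's incremental mode (run from the udev rules) not to
  # auto-assemble anything; remove the line again when you're done
  echo 'AUTO -all' >> /etc/mdadm.conf

  # or, more brutally, pause udev event processing during the disk work
  udevadm control --stop-exec-queue
  # ... do the fdisk/mdadm work here ...
  udevadm control --start-exec-queue

Once the old superblocks are really gone and nothing reassembles the
arrays behind your back, recreating them with --level=1 instead of
--level=0 should finally give you the mirrors you wanted.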

I've been fighting similar crap: on HPE servers, whenever cciss_vol_status
is run through the disk monitoring system and reports the hardware RAID
status, systemd scans all partition tables and tries to detect LVM2 devices
and whatever else. The kernel log is just filled with useless scans and I
have no idea how to get rid of it. Nice new systemd world.

Regards,
Simon