[CentOS] mounting XFS RAID-1 disk partition that needs repair.

Wed Nov 24 18:13:36 UTC 2021
Simon Matter <simon.matter at invoca.ch>

> haven't tried the suggestions yet, but here are some diagnostics on what
> happens when I attempt to mount it:
> upon running *mdadm --assemble /dev/md40 /mnt/dvd --run* (info from
> /var/log/messages):
> (note that /mnt/dvd is just an empty mount point that exists, used here
> for convenience).
>
> Nov 24 12:21:42 fcshome kernel: md: md40 stopped.
> Nov 24 12:21:42 fcshome kernel: md/raid1:md40: active with 1 out of 2 mirrors
> Nov 24 12:21:42 fcshome kernel: md40: detected capacity change from 0 to 996887429120
>
> output from doing:
> sudo mount /dev/md40 /mnt/dvd
> mount: mount /dev/md40 on /mnt/dvd failed: Structure needs cleaning
>
> corresponding items from /var/log/messages:
> Nov 24 12:22:55 fcshome kernel: XFS (md40): Superblock earlier than Version 5 has XFS_[PQ]UOTA_{ENFD|CHKD} bits.
> Nov 24 12:22:55 fcshome kernel: XFS (md40): Metadata corruption detected at xfs_sb_read_verify+0x122/0x160 [xfs], xfs_sb block 0xffffffffffffffff
> Nov 24 12:22:55 fcshome kernel: XFS (md40): Unmount and run xfs_repair
> Nov 24 12:22:55 fcshome kernel: XFS (md40): First 128 bytes of corrupted metadata buffer:
> Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0000: 58 46 53 42 00 00 10 00 00 00 00 00 0e 81 b1 e0  XFSB............
> Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0020: d2 22 a7 30 dd 88 48 8b bd bb 9c 8b 2a 22 72 cc  .".0..H.....*"r.
> Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0030: 00 00 00 00 08 00 00 04 00 00 00 00 00 00 00 80  ................
> Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82  ................
> Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0050: 00 00 00 01 00 74 0d 8f 00 00 00 20 00 00 00 00  .....t..... ....
> Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0060: 00 00 80 00 30 c4 02 00 01 00 00 10 00 00 00 00  ....0...........
> Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0070: 00 00 00 00 00 00 00 00 0c 09 08 04 17 00 00 19  ................
> Nov 24 12:22:55 fcshome kernel: XFS (md40): SB validate failed with error -117.
>
> running xfs_repair gives:
> sudo xfs_repair /dev/md40
> Phase 1 - find and verify superblock...
> xfs_repair: V1 inodes unsupported. Please try an older xfsprogs.
>
> before proceeding with other experiments, I decided to use dd to create an
> image file on my local disk of that partition so I could mess with it with
> less chance of trashing the on-disk partition. when attempting to use it,
> I get:
>
> sudo mdadm --assemble /dev/md41 ./part4.img --run
> mdadm: ./part4.img is not a block device.
> mdadm: ./part4.img has no superblock - assembly aborted
>
> So, I thought maybe the image had somehow become corrupted, so I did:
>
> sudo md5sum /dev/sdd4
> bd7cac3c886e7b3110e28100e119bb82  /dev/sdd4
>
> and
>
> md5sum part4.img
> bd7cac3c886e7b3110e28100e119bb82  part4.img
>
> which shows the partition and its disk image to be identical.
>
> Why shouldn't a dd image of a partition work just as well (for my purposes)
> as the actual disk partition? I've certainly done this before with EXTn and
> NTFS filesystems, is XFS somehow different in this regard?
>
> Do any of you know what I'm doing wrong here?

I'm not sure but I think you are making it too complicated.

If the partition is from a software RAID 1, then you should be able to use
it directly without building an mdadm array.

That said, IIRC it depends on the metadata version. If the metadata is at
the beginning of the partition, then you have to strip it by doing a dd to a
file and skipping the metadata at the beginning of the partition.
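
As a rough, untested sketch of that dd step: assuming 1.x metadata,
"mdadm --examine" on the member partition prints a "Data Offset" line in
512-byte sectors, and that is the amount to skip. The helper name
md_skip_dd_cmd, the 2048-sector value and the output file name are made up
for illustration. The function only prints the command so it can be checked
before anything is run.

```shell
# Sketch only: print (do not run) a dd command that copies a RAID member
# partition to a file while skipping the md header at the start.
# Assumption: the offset is the "Data Offset" value, in 512-byte sectors,
# reported for 1.x metadata by:  mdadm --examine /dev/sdd4
md_skip_dd_cmd() {
    sectors=$1   # data offset in 512-byte sectors (take it from --examine)
    src=$2       # member partition, e.g. /dev/sdd4
    dst=$3       # output image file (hypothetical name)
    echo "dd if=$src of=$dst bs=512 skip=$sectors"
}

# Example with a made-up offset of 2048 sectors:
md_skip_dd_cmd 2048 /dev/sdd4 part4-fs.img
```

bs=512 keeps the skip count in the same unit as the reported offset, at the
cost of a slow copy.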

md raid metadata locations:
0.9 	At the end of the device
1.0 	At the end of the device
1.1 	At the beginning of the device
1.2 	4K from the beginning of the device

So with metadata versions 0.9 or 1.0, you could use the md partition
directly like a normal partition; only some bytes at the end are not used
by the filesystem.
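
To make the rule concrete, here is a small sketch that encodes the table
above. The function name md_fs_at_start is made up, and the mount lines in
the comments are only what I would try first (read-only), not something
tested against your disk.

```shell
# Sketch only: say whether the filesystem starts at byte 0 of an md RAID-1
# member, based on the metadata version from the table above.
# The version can be read with:  mdadm --examine /dev/sdd4
md_fs_at_start() {
    case "$1" in
        0.9|0.90|1.0) echo yes ;;      # metadata at the end of the device
        1.1|1.2)      echo no ;;       # metadata at or near the beginning
        *)            echo unknown ;;  # anything not in the table
    esac
}

md_fs_at_start 1.0
md_fs_at_start 1.2

# If the answer is "yes", a read-only attempt on the member itself would be:
#   mount -o ro /dev/sdd4 /mnt/dvd
# or on the dd image of it:
#   mount -o ro,loop ./part4.img /mnt/dvd
```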

Regards,
Simon