[CentOS] mounting XFS RAID-1 disk partition that needs repair.

Wed Nov 24 19:32:38 UTC 2021
Fred <fred.fredex at gmail.com>

Ah HA! I believe I saw in a dump somewhere that it was 0.9, so I should be
off to the races!

Thanks!

Fred

On Wed, Nov 24, 2021 at 1:14 PM Simon Matter <simon.matter at invoca.ch> wrote:

> > I haven't tried the suggestions yet, but here are some diagnostics on what
> > happens when I attempt to mount it.
> > Upon running *mdadm --assemble /dev/md40 /mnt/dvd --run* (info from
> > /var/log/messages):
> > (note that /mnt/dvd is just an empty mount point that exists, used here
> > for convenience).
> >
> > Nov 24 12:21:42 fcshome kernel: md: md40 stopped.
> > Nov 24 12:21:42 fcshome kernel: md/raid1:md40: active with 1 out of 2 mirrors
> > Nov 24 12:21:42 fcshome kernel: md40: detected capacity change from 0 to 996887429120
> >
> > output from doing:
> > sudo mount /dev/md40 /mnt/dvd
> > mount: mount /dev/md40 on /mnt/dvd failed: Structure needs cleaning
> >
> > corresponding items from /var/log/messages:
> > Nov 24 12:22:55 fcshome kernel: XFS (md40): Superblock earlier than Version 5 has XFS_[PQ]UOTA_{ENFD|CHKD} bits.
> > Nov 24 12:22:55 fcshome kernel: XFS (md40): Metadata corruption detected at xfs_sb_read_verify+0x122/0x160 [xfs], xfs_sb block 0xffffffffffffffff
> > Nov 24 12:22:55 fcshome kernel: XFS (md40): Unmount and run xfs_repair
> > Nov 24 12:22:55 fcshome kernel: XFS (md40): First 128 bytes of corrupted metadata buffer:
> > Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0000: 58 46 53 42 00 00 10 00 00 00 00 00 0e 81 b1 e0  XFSB............
> > Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0020: d2 22 a7 30 dd 88 48 8b bd bb 9c 8b 2a 22 72 cc  .".0..H.....*"r.
> > Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0030: 00 00 00 00 08 00 00 04 00 00 00 00 00 00 00 80  ................
> > Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82  ................
> > Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0050: 00 00 00 01 00 74 0d 8f 00 00 00 20 00 00 00 00  .....t..... ....
> > Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0060: 00 00 80 00 30 c4 02 00 01 00 00 10 00 00 00 00  ....0...........
> > Nov 24 12:22:55 fcshome kernel: ffff8e0c8f4e0070: 00 00 00 00 00 00 00 00 0c 09 08 04 17 00 00 19  ................
> > Nov 24 12:22:55 fcshome kernel: XFS (md40): SB validate failed with error -117.
> >
> > running xfs_repair gives:
> > sudo xfs_repair /dev/md40
> > Phase 1 - find and verify superblock...
> > xfs_repair: V1 inodes unsupported. Please try an older xfsprogs.
> >
> > Before proceeding with other experiments, I decided to use dd to create an
> > image file on my local disk of that partition so I could mess with it with
> > less chance of trashing the on-disk partition. When attempting to use it, I
> > get:
> >
> > sudo mdadm --assemble /dev/md41 ./part4.img --run
> > mdadm: ./part4.img is not a block device.
> > mdadm: ./part4.img has no superblock - assembly aborted
> >
> > So, I thought maybe the image had somehow become corrupted, so I did:
> >
> > sudo md5sum /dev/sdd4
> > bd7cac3c886e7b3110e28100e119bb82  /dev/sdd4
> >
> > and
> >
> > md5sum part4.img
> > bd7cac3c886e7b3110e28100e119bb82  part4.img
> >
> > which shows the partition and its disk image to be identical.
> >
> > Why shouldn't a dd image of a partition work just as well (for my purposes)
> > as the actual disk partition? I've certainly done this before with EXTn and
> > NTFS filesystems; is XFS somehow different in this regard?
> >
> > Do any of you know what I'm doing wrong here?
>
> I'm not sure but I think you are making it too complicated.
>
> If the partition is from a software RAID 1, then you should be able to use
> it directly without building an mdadm array.
>
> That said, it depends on the metadata type IIRC. If metadata is in the
> beginning of the partition, then you have to remove it by doing a dd to a
> file and skipping the metadata in the beginning of the partition.
>
> md raid metadata locations:
> 0.9     At the end of the device
> 1.0     At the end of the device
> 1.1     At the beginning of the device
> 1.2     4K from the beginning of the device
>
> So with metadata versions 0.9 or 1.0, you could directly use the md
> partition like a normal partition; only some bytes at the end are not used
> by the filesystem.
>
> Regards,
> Simon
>
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
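
A minimal sketch of the approach Simon describes, assuming the member's
metadata can be read with `mdadm --examine` and that its output contains
`Version` and `Data Offset` lines. The helper name md_data_offset_bytes is
hypothetical, and nothing below was run against the disk in this thread. It
prints how many bytes to skip before the filesystem starts: 0 for 0.9x and 1.0
metadata (stored at the end of the device), and the reported data offset for
1.1 and 1.2.

```shell
#!/bin/sh
# Hypothetical helper: reads `mdadm --examine <member>` output on stdin and
# prints the byte offset at which the filesystem data begins.
# Assumption: "Data Offset" is reported in 512-byte sectors.
md_data_offset_bytes() {
    awk -F' : ' '
        # Trim padding around the field name on each "Key : Value" line.
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }
        key == "Version"     { version = $2 }
        key == "Data Offset" { split($2, parts, " "); sectors = parts[1] }
        END {
            if (version == "") { print "unknown"; exit 1 }
            # 0.9x and 1.0 keep the md superblock at the end of the device,
            # so the filesystem starts at byte 0.
            if (version ~ /^(0\.9|1\.0)/) { print 0; exit 0 }
            if (sectors == "") { print "unknown"; exit 1 }
            printf "%.0f\n", sectors * 512
        }'
}

# Possible usage (needs root, so left commented out):
#   off=$(sudo mdadm --examine /dev/sdd4 | md_data_offset_bytes)
#   sudo mount -o ro,loop,offset="$off" part4.img /mnt/dvd
```

Mounting the image read-only through a loop device avoids both the "not a
block device" complaint from mdadm and any accidental writes to the copy. A
filesystem that itself needs repair may of course still refuse to mount.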