I thought I'd test replacing a failed drive in a 4 drive raid 10 array on a CentOS 5.2 box before it goes online and before a drive really fails.
I failed and removed the drive with mdadm, powered off, replaced the drive, copied the partition table over with 'sfdisk -d /dev/sda | sfdisk /dev/sdb', and finally re-added it with 'mdadm --add'.
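Spelled out, that was roughly the following sequence (device names as in my setup; the exact invocations are from memory, so treat this as a sketch rather than a transcript):

# mdadm /dev/md3 --fail /dev/sdb4 --remove /dev/sdb4
  (likewise for the sdb partitions in the other md arrays)
# poweroff
  (swap the physical drive, boot back up)
# sfdisk -d /dev/sda | sfdisk /dev/sdb
# mdadm /dev/md3 --add /dev/sdb4
  (again for the other arrays, then let the resync finish)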
Everything seems fine until I try to create a snapshot lv. (Creating a snapshot lv worked before I replaced the drive.) Here's what I'm seeing.
# lvcreate -p r -s -L 8G -n home-snapshot /dev/vg0/homelv
  Couldn't find device with uuid 'yIIGF9-9f61-QPk8-q6q1-wn4D-iE1x-MJIMgi'.
  Couldn't find all physical volumes for volume group vg0.
  Volume group for uuid not found: I4Gf5TUB1M1TfHxZNg9cCkM1SbRo8cthCTTjVHBEHeCniUIQ03Ov4V1iOy2ciJwm
  Aborting. Failed to activate snapshot exception store.
So then I try
# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md3
  VG Name               vg0
  PV Size               903.97 GB / not usable 3.00 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              231416
  Free PE               44536
  Allocated PE          186880
  PV UUID               yIIGF9-9f61-QPk8-q6q1-wn4D-iE1x-MJIMgi
Subsequent runs of pvdisplay eventually return nothing. Running pvck /dev/md3 seems to restore that, but creating a snapshot volume still fails.
It's as if the "PV stuff" is not on the new drive. I (probably incorrectly) assumed that just adding the drive back into the raid array would take care of that.
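In case it suggests anything to anyone, these are the kinds of checks I know to run at this point (output omitted):

# cat /proc/mdstat
# mdadm --detail /dev/md3
# pvscan
# vgscan
# lvscan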
I've searched quite a bit but have not found any clues. Anyone?
-- Thanks, Mike
Mike wrote:
I thought I'd test replacing a failed drive in a 4 drive raid 10 array on a CentOS 5.2 box before it goes online and before a drive really fails.
I failed and removed the drive with mdadm, powered off, replaced the drive, copied the partition table over with 'sfdisk -d /dev/sda | sfdisk /dev/sdb', and finally re-added it with 'mdadm --add'.
Everything seems fine until I try to create a snapshot lv. (Creating a snapshot lv worked before I replaced the drive.) Here's what I'm seeing.
# lvcreate -p r -s -L 8G -n home-snapshot /dev/vg0/homelv
  Couldn't find device with uuid 'yIIGF9-9f61-QPk8-q6q1-wn4D-iE1x-MJIMgi'.
  Couldn't find all physical volumes for volume group vg0.
  Volume group for uuid not found: I4Gf5TUB1M1TfHxZNg9cCkM1SbRo8cthCTTjVHBEHeCniUIQ03Ov4V1iOy2ciJwm
  Aborting. Failed to activate snapshot exception store.
So then I try
# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md3
  VG Name               vg0
  PV Size               903.97 GB / not usable 3.00 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              231416
  Free PE               44536
  Allocated PE          186880
  PV UUID               yIIGF9-9f61-QPk8-q6q1-wn4D-iE1x-MJIMgi
Subsequent runs of pvdisplay eventually return nothing. Running pvck /dev/md3 seems to restore that, but creating a snapshot volume still fails.
It's as if the "PV stuff" is not on the new drive. I (probably incorrectly) assumed that just adding the drive back into the raid array would take care of that.
I've searched quite a bit but have not found any clues. Anyone?
It would be interesting to see what mdadm --detail /dev/mdX says.
I see the VG is made up of a single PV, md3? What are md0, md1, and md2 doing? I can guess md0 is probably /boot, but what about 1 and 2?
It wouldn't hurt to give the sfdisk partition dumps for the drives in question too.
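In other words, something along these lines, substituting your actual array and drive names:

# mdadm --detail /dev/md3
# sfdisk -d /dev/sda
# sfdisk -d /dev/sdb
# sfdisk -d /dev/sdc
# sfdisk -d /dev/sdd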
-Ross
On Thu, 17 Jul 2008, Ross S. W. Walker wrote:
It would be interesting to see what mdadm --detail /dev/mdX says.
I see the VG is made up of a single PV, md3? What are md0, md1, and md2 doing? I can guess md0 is probably /boot, but what about 1 and 2?
It wouldn't hurt to give the sfdisk partition dumps for the drives in question too.
-Ross
Thanks for the reply. md2 is /boot, md0 is /root and md1 is swap.
# mdadm --detail /dev/md3
/dev/md3:
        Version : 00.90.03
  Creation Time : Fri Jul 4 17:11:30 2008
     Raid Level : raid10
     Array Size : 947883008 (903.97 GiB 970.63 GB)
  Used Dev Size : 473941504 (451.99 GiB 485.32 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Thu Jul 17 15:58:52 2008
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=1, far=2
     Chunk Size : 256K

           UUID : 7ecb1de6:c6e22a3a:1bd5446a:1dcd5444
         Events : 0.3852

    Number   Major   Minor   RaidDevice State
       0       8        4        0      active sync   /dev/sda4
       1       8       20        1      active sync   /dev/sdb4
       2       8       36        2      active sync   /dev/sdc4
       3       8       52        3      active sync   /dev/sdd4
# sfdisk -l /dev/sda

Disk /dev/sda: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/sda1   *      0+     12      13-    104391   fd  Linux raid autodetect
/dev/sda2         13    1287    1275   10241437+  fd  Linux raid autodetect
/dev/sda3       1288    1797     510    4096575   fd  Linux raid autodetect
/dev/sda4       1798   60800   59003  473941597+  fd  Linux raid autodetect

# sfdisk -l /dev/sdb

Disk /dev/sdb: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/sdb1   *      0+     12      13-    104391   fd  Linux raid autodetect
/dev/sdb2         13    1287    1275   10241437+  fd  Linux raid autodetect
/dev/sdb3       1288    1797     510    4096575   fd  Linux raid autodetect
/dev/sdb4       1798   60800   59003  473941597+  fd  Linux raid autodetect

# sfdisk -l /dev/sdc

Disk /dev/sdc: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/sdc1   *      0+     12      13-    104391   fd  Linux raid autodetect
/dev/sdc2         13    1287    1275   10241437+  fd  Linux raid autodetect
/dev/sdc3       1288    1797     510    4096575   fd  Linux raid autodetect
/dev/sdc4       1798   60800   59003  473941597+  fd  Linux raid autodetect

# sfdisk -l /dev/sdd

Disk /dev/sdd: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/sdd1   *      0+     12      13-    104391   fd  Linux raid autodetect
/dev/sdd2         13    1287    1275   10241437+  fd  Linux raid autodetect
/dev/sdd3       1288    1797     510    4096575   fd  Linux raid autodetect
/dev/sdd4       1798   60800   59003  473941597+  fd  Linux raid autodetect
Just for the record, I'm about 98.7% sure that the root problem here was that the LVM setup (pvcreate, vgcreate, lvcreate) was done while booted from systemrescuecd and had nothing to do with replacing a failed drive.
The output from 'pvcreate --version' on the systemrescuecd is:

  LVM version:     2.02.33 (2008-01-31)
  Library version: 1.02.26 (2008-06-06)
  Driver version:  4.13.0

And when booted from CentOS 5.2:

  LVM version:     2.02.32-RHEL5 (2008-03-04)
  Library version: 1.02.24 (2007-12-20)
  Driver version:  4.11.5
When [pv|vg|lv]create is done as it should have been (after booting into CentOS), snapshot volume creation works as expected, even after replacing a failed drive.
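For completeness, the recreation from the installed CentOS system looked roughly like this; the LV size below is only illustrative, and it assumes the data was backed up and the old VG/PV made under systemrescuecd were removed first:

# pvcreate /dev/md3
# vgcreate vg0 /dev/md3
# lvcreate -L 500G -n homelv vg0
# lvcreate -p r -s -L 8G -n home-snapshot /dev/vg0/homelv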