[CentOS] domU corrupt after server crash, help needed trying to recover domU LVM

Fri May 8 13:43:12 UTC 2009
Rudi Ahlers <rudiahlers at gmail.com>

On Fri, May 8, 2009 at 2:30 PM, Rudi Ahlers <rudiahlers at gmail.com> wrote:

> Hi all,
>
> One of our Dell servers has failed badly, and one of the domUs has been
> corrupted in the process. It boots up to a point and then gives me a kernel
> panic:
>
> Loading dm-zero.ko module
> Loading dm-snapshot.ko module
> Scanning and configuring dmraid supported devices
> Scanning logical volumes
>   Reading all physical volumes.  This may take a while...
>   No volume groups found
> Activating logical volumes
>   Volume group "VolGroup00" not found
> Creating root device.
> Mounting root filesystem.
> mount: could not find filesystem '/dev/root'
> Setting up other filesystems.
> Setting up new root fs
> setuproot: moving /dev failed: No such file or directory
> no fstab.sys, mounting internal defaults
> setuproot: error mounting /proc: No such file or directory
> setuproot: error mounting /sys: No such file or directory
> Switching to new root and running init.
> unmounting old /dev
> unmounting old /proc
> unmounting old /sys
> switchroot: mount failed: No such file or directory
> Kernel panic - not syncing: Attempted to kill init!
>
>
> It shows up as a Zombie:
>
> [root at xen ~]# xm list
> Name                                      ID Mem(MiB) VCPUs State   Time(s)
> Domain-0                                   0     1439     1 r-----    329.0
> Zombie-hfserver2                          15     1024     1 ----c-      0.5
> hfdns02                                   10      519     2 r-----   1552.8
>
>
> I can't mount it either:
>
> [root at xen ~]# mount /dev/data/hf
> hfdns02    hfserver2
> [root at xen ~]# mount /dev/data/hfserver2 /mnt/cpanel/
> mount: you must specify the filesystem type
> [root at xen ~]# mount -o loop /dev/data/hfserver2 /mnt/cpanel/
> mount: you must specify the filesystem type
>
> Here's the lvscan and vgscan output for the LVM volumes:
>
> [root at xen ~]# lvscan
>   ACTIVE            '/dev/data/cpanel002' [100.00 GB] inherit
>   ACTIVE            '/dev/data/windows2003_web' [30.00 GB] inherit
>   ACTIVE            '/dev/data/storage' [50.00 GB] inherit
>   ACTIVE   Original '/dev/data/hfserver2' [30.00 GB] inherit
>   ACTIVE            '/dev/data/hfdns02' [30.00 GB] inherit
>   ACTIVE            '/dev/data/pluto' [30.00 GB] inherit
>   ACTIVE   Snapshot '/dev/data/pluto_s' [30.00 GB] inherit
>   ACTIVE            '/dev/system/root' [39.06 GB] inherit
>   ACTIVE            '/dev/system/swap' [9.75 GB] inherit
> [root at xen ~]# vgscan
>   Reading all physical volumes.  This may take a while...
>   Found volume group "data" using metadata type lvm2
>   Found volume group "system" using metadata type lvm2
> [root at xen ~]#
>
>
> Does anyone know how to fix an LVM volume like this?
>
> --
>

 Here's what I've done so far:

[root at xen ~]# losetup /dev/loop4  /dev/data/hfserver2

# This attaches the logical volume /dev/data/hfserver2 to the loop device
# /dev/loop4 so it can be treated like a physical disk.
# "data" is the volume group (VG) name on the host (dom0).
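
A quick way to confirm that the logical volume holds a whole partitioned disk rather than a bare filesystem (which would be consistent with the earlier "you must specify the filesystem type" error) is to look for the MBR boot signature. This is only a sketch: the function name `has_mbr_signature` is made up for illustration, and the signature is a hint, not proof, since some filesystem boot sectors carry the same two bytes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if bytes 510-511 of the given device or
# image file are 0x55 0xAA, the boot signature that ends an MBR sector.
has_mbr_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example use on the host (needs read access to the device):
#   has_mbr_signature /dev/data/hfserver2 && echo "looks like a partitioned disk"
```

If the signature is present, mapping the partitions with kpartx (as done next) is the natural follow-up.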

[root at xen ~]#  kpartx -va /dev/loop4
add map loop4p1 : 0 208782 linear /dev/loop4 63
add map loop4p2 : 0 62701695 linear /dev/loop4 208845

# This creates device-mapper entries in /dev/mapper for the partitions
# inside /dev/data/hfserver2 (loop4p1 and loop4p2).
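
The kpartx lines above are device-mapper tables, whose start, length and offset values are counted in 512-byte sectors. As a rough sketch, the sector numbers can be converted to byte offsets and sizes like this; the function name `partition_bytes` is made up for illustration, and it assumes the "add map" line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read "kpartx -va" output on stdin and print each
# partition's byte offset and size. Device-mapper tables use 512-byte sectors.
partition_bytes() {
    awk '$1 == "add" && $2 == "map" {
        # $3 = map name, $(NF-3) = length in sectors, $NF = start sector
        printf "%s offset=%.0f size=%.0f\n", $3, $NF * 512, $(NF-3) * 512
    }'
}

# Example with the two lines from the transcript:
printf '%s\n' \
    'add map loop4p1 : 0 208782 linear /dev/loop4 63' \
    'add map loop4p2 : 0 62701695 linear /dev/loop4 208845' | partition_bytes
# prints:
#   loop4p1 offset=32256 size=106896384
#   loop4p2 offset=106928640 size=32103267840
```

So the first partition is about 102 MiB (presumably the guest's /boot) and the second, about 29.9 GiB, is presumably the physical volume behind VolGroup00. The byte offset is also the value one would pass to "losetup -o" to attach a single partition without kpartx.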

[root at xen ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup00" using metadata type lvm2
  Found volume group "data" using metadata type lvm2
  Found volume group "system" using metadata type lvm2


[root at xen ~]# lvscan
  inactive          '/dev/VolGroup00/LogVol00' [27.94 GB] inherit
  inactive          '/dev/VolGroup00/LogVol01' [1.94 GB] inherit
  ACTIVE            '/dev/data/cpanel002' [100.00 GB] inherit
  ACTIVE            '/dev/data/windows2003_web' [30.00 GB] inherit
  ACTIVE            '/dev/data/storage' [50.00 GB] inherit
  ACTIVE   Original '/dev/data/hfserver2' [30.00 GB] inherit
  ACTIVE            '/dev/data/hfdns02' [30.00 GB] inherit
  ACTIVE            '/dev/data/pluto' [30.00 GB] inherit
  ACTIVE   Snapshot '/dev/data/pluto_s' [30.00 GB] inherit
  ACTIVE            '/dev/system/root' [39.06 GB] inherit
  ACTIVE            '/dev/system/swap' [9.75 GB] inherit

[root at xen ~]# lvchange -ay VolGroup00
[root at xen ~]# lvscan
  ACTIVE            '/dev/VolGroup00/LogVol00' [27.94 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol01' [1.94 GB] inherit
  ACTIVE            '/dev/data/cpanel002' [100.00 GB] inherit
  ACTIVE            '/dev/data/windows2003_web' [30.00 GB] inherit
  ACTIVE            '/dev/data/storage' [50.00 GB] inherit
  ACTIVE   Original '/dev/data/hfserver2' [30.00 GB] inherit
  ACTIVE            '/dev/data/hfdns02' [30.00 GB] inherit
  ACTIVE            '/dev/data/pluto' [30.00 GB] inherit
  ACTIVE   Snapshot '/dev/data/pluto_s' [30.00 GB] inherit
  ACTIVE            '/dev/system/root' [39.06 GB] inherit
  ACTIVE            '/dev/system/swap' [9.75 GB] inherit

[root at xen ~]# e2fsck /dev/VolGroup00/LogVol00
e2fsck 1.39 (29-May-2006)
/dev/VolGroup00/LogVol00: clean, 631982/7325696 files, 4512772/7323648 blocks


At first it found a whole lot of damaged inodes, which I repaired; the clean
result above is from a later run.
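
When repeating the check, it can help to run e2fsck read-only first and look at its exit status, which is a sum of the bit values documented in the e2fsck man page. The decoder below is a minimal sketch; the function name `describe_fsck_status` and the wording of the messages are my own, only the bit values come from the man page.

```shell
#!/bin/sh
# Hypothetical helper: turn an e2fsck exit status into readable text.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    result=""
    # Each line below is "bit:meaning"; the status is the sum of the set bits.
    while IFS=: read -r bit text; do
        if [ $(( status & bit )) -ne 0 ]; then
            result="${result:+$result; }$text"
        fi
    done <<EOF
1:errors corrected
2:errors corrected, reboot advised
4:errors left uncorrected
8:operational error
16:usage or syntax error
32:cancelled by user
128:shared library error
EOF
    echo "$result"
}

# Example use (-f forces a check even if marked clean, -n changes nothing):
#   e2fsck -fn /dev/VolGroup00/LogVol00; describe_fsck_status $?
```

A status with bit 4 set after a read-only pass means a repairing run is still needed.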

Then, I reversed the process:

[root at xen ~]# lvchange -an VolGroup00
[root at xen ~]# lvscan
  inactive          '/dev/VolGroup00/LogVol00' [27.94 GB] inherit
  inactive          '/dev/VolGroup00/LogVol01' [1.94 GB] inherit
  ACTIVE            '/dev/data/cpanel002' [100.00 GB] inherit
  ACTIVE            '/dev/data/windows2003_web' [30.00 GB] inherit
  ACTIVE            '/dev/data/storage' [50.00 GB] inherit
  ACTIVE   Original '/dev/data/hfserver2' [30.00 GB] inherit
  ACTIVE            '/dev/data/hfdns02' [30.00 GB] inherit
  ACTIVE            '/dev/data/pluto' [30.00 GB] inherit
  ACTIVE   Snapshot '/dev/data/pluto_s' [30.00 GB] inherit
  ACTIVE            '/dev/system/root' [39.06 GB] inherit
  ACTIVE            '/dev/system/swap' [9.75 GB] inherit



[root at xen ~]# vgchange -an VolGroup00
  0 logical volume(s) in volume group "VolGroup00" now active
[root at xen ~]# kpartx -d /dev/loop4
[root at xen ~]# losetup -d /dev/loop4
[root at xen ~]#
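
The teardown above mirrors the setup in reverse order: deactivate the guest's volume group, remove the partition mappings, then detach the loop device. A sketch of both directions as a small script follows. The function names and the DRY_RUN switch are hypothetical, and it uses "vgchange -ay" for activation where the transcript used "lvchange -ay". By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = host LV holding the guest disk, $2 = loop device, $3 = guest VG name
attach_guest_disk() {
    run losetup "$2" "$1"      # expose the LV as a loop device
    run kpartx -va "$2"        # map the partitions inside it
    run vgchange -ay "$3"      # activate the guest's volume group
}

# $1 = loop device, $2 = guest VG name; strictly the reverse order
detach_guest_disk() {
    run vgchange -an "$2"
    run kpartx -d "$1"
    run losetup -d "$1"
}

# Example (dry run, prints the three teardown commands):
detach_guest_disk /dev/loop4 VolGroup00
```

The guest should not be running while its disk is attached on the host this way, since both sides would then be writing to the same volume.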

[root at xen ~]# xm create -c /etc/xen/hfserver2


And then it dies:


  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup00" using metadata type lvm2
Activating logical volumes
  2 logical volume(s) in volume group "VolGroup00" now active
Creating root device.
Mounting root filesystem.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
Setting up other filesystems.
Setting up new root fs
no fstab.sys, mounting internal defaults
Switching to new root and running init.
unmounting old /dev
unmounting old /proc
unmounting old /sys
exec of init (/sbin/init) failed!!!: No such file or directory
Kernel panic - not syncing: Attempted to kill init!
 [root at xen ~]#
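
The root filesystem now mounts, so the "exec of init (/sbin/init) failed" message suggests that /sbin/init, or something it needs to start, is no longer where it should be. One possibility is that e2fsck reconnected orphaned files under lost+found. A rough check, assuming the repaired root volume has been reattached and mounted read-only on the host, could look like this; the function name `check_guest_root` and the list of paths are my own choice, not a complete list of what init needs.

```shell
#!/bin/sh
# Hypothetical helper: report essential paths missing under a mounted guest
# root and count lost+found entries. Returns non-zero if anything is missing.
check_guest_root() {
    root=$1
    missing=0
    for path in sbin/init bin/sh lib etc/fstab; do
        # -L also accepts symlinks that only resolve inside the guest
        if [ ! -e "$root/$path" ] && [ ! -L "$root/$path" ]; then
            echo "missing: /$path"
            missing=1
        fi
    done
    if [ -d "$root/lost+found" ]; then
        count=$(ls -A "$root/lost+found" | wc -l | tr -d ' ')
        echo "lost+found entries: $count"
    fi
    return "$missing"
}

# Example use on the host, after losetup/kpartx/activation as before:
#   mount -o ro /dev/VolGroup00/LogVol00 /mnt/cpanel
#   check_guest_root /mnt/cpanel
```

If /sbin/init is present, the same "No such file or directory" error can also come from a missing dynamic loader or library under /lib, so that directory is worth inspecting as well.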


-- 
Kind Regards
Rudi Ahlers
CEO, SoftDux Hosting
Web: http://www.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532