New subject: CentOS 6.x, kernel-2.6.32-220.7.1, EC2 and drive enumeration

16 Apr 2012


      Hi all,
Is anyone successfully running/has succesfully upgraded to 2.6.32-220
from, say, 2.6.32-71.29.1? (i.e. done a normal run-of-the-mill yum
update on, say a 6.0 instance all the way up cleanly to 6.2?
Reason I ask is that booting into -220 (and I think also into -131 as
well) results in a kernel panic for me. Some digging around and the new kernel
seems to be enumerating the drives with the wrong minors
An m1.large instance-store instance has root at /dev/xvda1, and 2
ephemeral drives xvdb and xvdc. On reboot into 2.6.32-220.7.1 this is
what the kernel reports this:
dracut: dracut-004-256.el6
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.22.6-ioctl (2011-10-19) initialised: dm-devel@redhat.com
udev: starting version 147
dracut: Starting plymouth daemon
%Gxlblk_init: register_blkdev major: 202 
blkfront: xvde1: barriers disabled
blkfront: xvdf: barriers disabled
 xvdf: unknown partition table
%Gblkfront: xvdg: barriers disabled
 xvdg: unknown partition table
dracut Warning: No root device "block:/dev/xvda1" found
%G%G
dracut Warning: Boot has failed. To debug this issue add "rdshell" to the kernel command line.
dracut Warning: Signal caught!
dracut Warning: Boot has failed. To debug this issue add "rdshell" to the kernel command line.
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Not tainted 2.6.32-220.7.1.el6.x86_64 #1
Call Trace:
 [<ffffffff814ec3fa>] ? panic+0x78/0x143
 [<ffffffff810074db>] ? __raw_callee_save_xen_irq_enable+0x11/0x26
 [<ffffffff8106ed72>] ? do_exit+0x852/0x860
 [<ffffffff81177f75>] ? fput+0x25/0x30
 [<ffffffff8106edd8>] ? do_group_exit+0x58/0xd0
 [<ffffffff8106ee67>] ? sys_exit_group+0x17/0x20
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Altering the grub menu.1st and fstab and the instance will boot
from /dev/xvde1 but obviously the device change is fairly fundamental.
The image that this instance is booting was built in a 6.0 chroot and I
have about 100 of them successfully running; but this innocuous update
breaks things in a major way.
I've had a poke around the upstream bugzilla (e.g.
https://bugzilla.redhat.com/show_bug.cgi?id=771912)  as well as the EC2
forums and there are a couple of similar but not-quite-the-same
problems with the newer kernel.
Does anyone have any similar experience or advice?
Cheers,
Steph
-- 
Steph Gosling steph@chuci.org