[CentOS-virt] Xen HVM domU won't start since updates to 2.6.18-164.15.1 kernels

Sun Mar 28 00:55:17 UTC 2010
Aaron Clark <ophidian at ophidian.homeip.net>

I have a dom0 machine running CentOS 5.4 with all the latest updates 
using Xen as my hypervisor. I am using Xen in part because this machine 
was set up prior to KVM being included in RHEL, and in part because 
KVM's network bridging configuration is not nearly as simple as Xen's. 
The dom0 machine is a headless Mac Mini and I do all of my VM management 
via virsh from the command line. I have two HVM domUs:

     * A web server running CentOS 5.4
     * A mail server running Gentoo

Both VMs are backed by LVs on the dom0 but do not use LVM inside the domU. 
Both have virtually identical libvirt configurations (differing only in 
expected things like name, UUID, NIC MAC, VNC port, etc).

The web server domU (WSdomU hereafter) has not started since I applied the 
most recent kernel update (kernel-xen-2.6.18-164.15.1.el5.x86_64 and 
kernel-2.6.18-164.15.1.el5.x86_64 for the dom0 and WSdomU respectively). 
By 'not start' I mean it appears to be running but it does not use any 
CPU cycles, does not bring up a graphical console, and does not respond 
on the network. In xentop the WSdomU is listed with no state rather than 
the normal running or blocked. The mail server domU starts fine and 
functions normally.
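
For anyone wanting to spot the same symptom from a script rather than by 
watching xentop, here is a rough sketch. It assumes the text comes from 
`xm list` in its usual six-column layout (Name, ID, Mem, VCPUs, State, 
Time), where the State column is a string of flag characters such as 
r----- or -b----. The function name is made up for illustration.

```python
def domains_without_state(xm_list_output):
    """Return names of domains whose State column has no flags set.

    Assumes the six-column `xm list` layout; lines that do not split into
    six fields (for example, domains with a blank ID) are skipped.
    """
    names = []
    for line in xm_list_output.splitlines():
        # Split from the right so a name containing spaces stays intact.
        fields = line.rsplit(None, 5)
        if len(fields) != 6:
            continue
        name, state = fields[0].strip(), fields[4]
        # The header row has "State" here, so it is filtered out naturally.
        if state and set(state) == set("-"):
            names.append(name)
    return names
```

Feeding it the captured output of `xm list` on the dom0 should return the 
WSdomU's name if it is in the state described above, and an empty list 
otherwise.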

Here are the steps I have taken so far that did not solve the problem:

     * Rebooted the dom0 to see if things would come up on their own
     * Attempted to connect to the WSdomU's graphical (VNC) console from 
the dom0
     * Checked the Xen dmesg on the dom0
     * Checked the xend logs (a cursory viewing did not show anything blatant; 
specific suggestions of things to look for would be appreciated)
     * Shut down the mail server domU and attempted to start the WSdomU
     * Used kpartx to access the partitions of the domU
           o Tried switching grub to use the previous kernel
     * Checked the SELinux labels on the backing LVs (they're the same)
     * Set SELinux to permissive and attempted to start the WSdomU
     * Used virsh edit to try tweaking the WSdomU config
     * Ran virsh undefine, rebooted, then virsh define with the WSdomU config
     * Used fdisk on the LV to ensure it has the correct partition layout
     * Used dd to copy the WSdomU LV to an .img file, copied it to my Fedora 
desktop and ran it under KVM (works fine)
     * Used dd to write the .img file to a new LV and created a new libvirt 
config xml (fails to start)

Sample config used to attempt a reconfiguration:

<domain type='xen'>
   <name>Webserver</name>
   <os>
     <type>hvm</type>
     <loader>/usr/lib/xen/boot/hvmloader</loader>
     <boot dev='hd'/>
   </os>
   <memory>262144</memory>
   <vcpu>1</vcpu>
   <on_poweroff>destroy</on_poweroff>
   <on_reboot>restart</on_reboot>
   <on_crash>restart</on_crash>
   <features>
     <acpi/>
     <apic/>
     <pae/>
   </features>
   <devices>
     <emulator>/usr/lib64/xen/bin/qemu-dm</emulator>
     <interface type='bridge'>
       <source bridge='xenbr0'/>
       <script path='vif-bridge'/>
     </interface>
     <disk type='block'>
       <source dev='/dev/mapper/SystemsVG-Webserver'/>
       <target dev='hda'/>
     </disk>
     <graphics type='vnc' />
   </devices>
</domain>
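
One cheap sanity check on a config like this is to confirm that every 
host path it names still exists after the update. The sketch below is 
only an illustration, not something taken from libvirt itself: the 
function names are hypothetical, and it assumes a Python interpreter with 
xml.etree.ElementTree (in the standard library from Python 2.5 on).

```python
import os
import xml.etree.ElementTree as ET


def referenced_paths(domain_xml):
    """Return (label, path) pairs for host files the domain XML points at."""
    root = ET.fromstring(domain_xml)
    found = []
    loader = root.find("os/loader")
    if loader is not None and loader.text:
        found.append(("loader", loader.text.strip()))
    emulator = root.find("devices/emulator")
    if emulator is not None and emulator.text:
        found.append(("emulator", emulator.text.strip()))
    for source in root.findall("devices/disk/source"):
        # Block-backed disks use dev=..., file-backed disks use file=...
        path = source.get("dev") or source.get("file")
        if path:
            found.append(("disk", path))
    return found


def missing_paths(pairs):
    """Return the (label, path) pairs whose path does not exist on this host."""
    return [(label, path) for label, path in pairs if not os.path.exists(path)]
```

Run on the dom0 against the output of `virsh dumpxml Webserver`, 
`missing_paths(referenced_paths(xml_text))` should come back empty if the 
loader, emulator and backing LV are all present. An empty result only 
rules out missing files, not a broken hvmloader or qemu-dm.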

What steps should I take next to debug this?

Aaron
-- 
"The goblins are in charge of maintenance?  Why not just set it on fire 
now and call it a day?"
--Whip Tongue, Viashino Technician