[CentOS-virt] Cannot start xen domUs anymore, domUs hang on kernel startup, happens after a long dom0 uptime

Wed Aug 13 22:00:01 UTC 2008
Pasi Kärkkäinen <pasik at iki.fi>

Hello!

I have now hit this problem twice. Last time I fixed it by rebooting
the Xen host/dom0 (CentOS 5.1, x86 32-bit).

Symptoms:

- Already running domUs (Debian 2.6.18-6-xen-686, 32-bit PAE) keep running
  and work fine.

- New domUs (Debian 2.6.18-6-xen-686) cannot be started. The kernel bootup
  just hangs before running the initrd. The same domU with the exact same
  Xen domU cfgfile worked earlier.

- The problem appears after a "longer" uptime. At the moment the dom0 host
  has been up for 174 days. I cannot say what the actual "limit" is: the
  last time I used the dom0 everything worked fine, and now it does not.


"xm list" shows the domain in state "-b----" (blocked), and the Time(s)
value for the domain does not increase. It stays at about 1.9 and might
reach 2.0 after 15 minutes or so. It looks like nothing is happening in the
domU kernel.
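
A rough way to watch the Time(s) value over several samples is sketched
below. It assumes Python 3.7 or newer is available on the dom0 (a stock
CentOS 5 install may not have it) and that "xm list" prints the usual
Name, ID, Mem, VCPUs, State, Time(s) columns. The function names, the
sampling interval and the domain name "testvm" are made up for
illustration only.

```python
import subprocess
import time


def parse_xm_list(output, domain):
    """Return (state, cpu_seconds) for `domain` from `xm list` text, or None."""
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Assumed columns: Name ID Mem VCPUs State Time(s)
        if len(fields) >= 6 and fields[0] == domain:
            return fields[-2], float(fields[-1])
    return None


def watch(domain, interval=60, samples=5):
    """Sample `xm list` and flag samples where CPU time did not advance."""
    previous = None
    for _ in range(samples):
        output = subprocess.run(
            ["xm", "list"], capture_output=True, text=True, check=True
        ).stdout
        entry = parse_xm_list(output, domain)
        if entry is None:
            print(domain, "is not listed")
            return
        state, cpu_seconds = entry
        stalled = previous is not None and cpu_seconds <= previous
        print(state, cpu_seconds, "STALLED" if stalled else "")
        previous = cpu_seconds
        time.sleep(interval)
```

Calling watch("testvm") as root on the dom0 prints one line per sample. A
domU that sits in "-b----" with the same Time(s) across samples matches the
behaviour described above.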

"xm console" shows that the bootup has stalled. Destroying the domU and
restarting it may move the hang to a slightly different place
(plus or minus a couple of lines), but it always hangs before the initrd
image is actually executed.

Example console output:

checking if image is initramfs... it is
Freeing initrd memory: 12028k freed
Grant table initialized
NET: Registered protocol family 16
SMP alternatives: switching to SMP code
<hangs here, nothing happens anymore>

As I said, the hang could be a couple of lines earlier or later.

I tried changing the domU to use just a single vcpu, which did not help. I
also tried lowering and raising the amount of memory, which did not help
either.

Any ideas on how to debug this?

-- Pasi