[CentOS-virt] reboot guest on panic

Mon Mar 15 00:50:03 UTC 2010
Norman Gaywood <ngaywood at une.edu.au>

On Sun, Mar 14, 2010 at 06:35:02PM -0600, compdoc wrote:
> >64bit multi-vcpu. The guest is quite heavyweight, 30GB of memory and
> >12vcpu. It's a LTSP server designed to handle lots of graphical logins
> >for computer science students.  This, I guess is not a common workload.
 
> I was wondering if you wouldn't mind describing the hardware
> this runs on?
 

Sure, more detail at:

https://bugzilla.redhat.com/show_bug.cgi?id=550724

This is cut'n'pasted from there (comment #13):


This hardware is relatively new, just over 6 months old. The main idea
of the system is to be a development environment for math/comp sci
students. It's set up to deal with up to 60 LTSP (Linux Terminal Server
Project) terminals and nxclient/ssh connections. It replaces a 4-year-old
HP server that did the same thing. The old HP setup ran Fedora 10, at
its end of life, on the bare metal. The new server was supposed to make
use of virtualization.

The hardware of the new dom0 server is an IBM x3850 M2 with 4 Xeon Quad
Core E7330 80W processors and 64GB of memory. Two IBM 73.4GB 2.5in 10K
HS SAS HDDs make up the system storage for dom0.

At the moment we are running CentOS 5.4 with the latest kernel I could
find: kernel-xen-2.6.18-186.el5

The main storage is a SAS-attached IBM DS3200 with 12 750GB SATA HDDs
configured as one large RAID 6 drive. We break up the large drive
using LVM logical volumes.

Various attachments of config and dmesg of dom0 to follow.

I see no strange error messages in the dom0 (including
/var/log/messages) except for this one:

(XEN) traps.c:1878:d5 Domain attempted WRMSR 000000000000008b from
00000021:00000000 to 00000000:00000000.

reported by "xm dmesg"
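
For anyone who wants to watch for that message automatically, here is a
minimal Python sketch that pulls the fields out of the line above. It is
only an illustration: the function name parse_wrmsr is made up, and I am
assuming the two colon-separated pairs are the high and low 32-bit halves
of the old and new MSR values, which the message itself does not say.

```python
import re

# Pattern for the hypervisor's "Domain attempted WRMSR" message as quoted
# above. Assumption: each "xxxxxxxx:xxxxxxxx" pair is high:low 32-bit halves.
WRMSR_RE = re.compile(
    r"\(XEN\)\s+(?P<src>[\w.]+):(?P<line>\d+):d(?P<dom>\d+)\s+"
    r"Domain attempted WRMSR\s+(?P<msr>[0-9a-fA-F]+)\s+from\s+"
    r"(?P<old_hi>[0-9a-fA-F]+):(?P<old_lo>[0-9a-fA-F]+)\s+to\s+"
    r"(?P<new_hi>[0-9a-fA-F]+):(?P<new_lo>[0-9a-fA-F]+)"
)


def parse_wrmsr(text):
    """Return a dict for the first WRMSR message in text, or None."""
    # Collapse whitespace first so a message wrapped across lines
    # (as in this mail) still matches.
    m = WRMSR_RE.search(" ".join(text.split()))
    if m is None:
        return None

    def join64(hi, lo):
        # Combine the two hex halves into one 64-bit integer.
        return (int(hi, 16) << 32) | int(lo, 16)

    return {
        "source": m.group("src"),
        "source_line": int(m.group("line")),
        "domain_id": int(m.group("dom")),
        "msr": int(m.group("msr"), 16),
        "old_value": join64(m.group("old_hi"), m.group("old_lo")),
        "new_value": join64(m.group("new_hi"), m.group("new_lo")),
    }


sample = """(XEN) traps.c:1878:d5 Domain attempted WRMSR 000000000000008b from
00000021:00000000 to 00000000:00000000."""
info = parse_wrmsr(sample)
```

Fed the message above, this reports domain id 5 and MSR 0x8b. The text
could come from saving the output of "xm dmesg" to a file and reading it
back in.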

We don't use NFS in dom0 and the network around here is pretty much
stable now.

One thing to note: originally we had hoped to run a Fedora kernel as
a dom0. However, we struck bug #541615 (Calgary: DMA error on CalIOC2
PHB 0x3) and so were unable to get the attached storage to pass disk
tests. RH enterprise/CentOS is rock solid as a dom0 and passes any disk
tests we can throw at it.


-- 
Norman Gaywood, Computer Systems Officer
University of New England, Armidale, NSW 2351, Australia

ngaywood at une.edu.au            Phone: +61 (0)2 6773 3337
http://mcs.une.edu.au/~norm    Fax:   +61 (0)2 6773 3312

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html