[CentOS] diagnosing strange crash/hang
gordonthree at gmail.com
Mon Jun 18 16:11:26 UTC 2007
This morning I got a call that a server was down. The server in
question is a VMware guest running Windows 2003 Advanced. The host is
VMware Server 1.01, running on CentOS 4.4 x64 on a PowerEdge 2950. The
server has 16 GB of RAM and a quad-core CPU; storage is provided by a
PERC 5/i, RAID 1 across two 146 GB SAS drives.
I was able to ssh into the host. After trying to ping the guest and
trying to connect to VMware via the management console, I decided to
restart the vmware service, so I typed "service vmware restart". It
hung on "shutting down virtual machines". I was able to ctrl-c out,
and decided to kill the vmware processes manually. After killing all
the vmware stuff, I did a "service vmware start" and got an error:
"cannot touch /etc/vmware/locations: read only file system"
/etc is part of /, which mount claimed was mounted RW.
So I tried cat /var/log/messages and got nothing.
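
One possible explanation for the mismatch above: on systems of this era, mount prints the contents of /etc/mtab, which cannot be updated once / itself has gone read-only, while /proc/mounts reflects what the kernel currently believes. The following is a minimal Python sketch for cross-checking the kernel's view. The function names are illustrative, not part of any existing tool, and it assumes the usual whitespace-separated /proc/mounts layout (device, mount point, type, options).

```python
def mount_options(mounts_text, mount_point="/"):
    """Return the option list of the last entry for mount_point, or None.

    The last matching entry wins because /proc/mounts may list the root
    file system more than once (e.g. rootfs first, then the real device).
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if the kernel lists mount_point with the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


def root_is_read_only(path="/proc/mounts"):
    """Read the kernel's mount table and report whether / is read-only."""
    with open(path) as handle:
        return is_read_only(handle.read(), "/")
```

Calling root_is_read_only() on the affected host while the problem is occurring would show whether the kernel had remounted / read-only even though mount still reported rw.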
So I told the machine to reboot (remotely). Of course, it didn't
come back up on its own, so I drove to the location. The machine was
running, but sitting at a black screen. I don't know what state it
was in, so I forced a power-off. When I turned it back on, it
booted normally. It paused briefly while it ran fsck on /, but other
than that, no errors.
The VMs restarted normally and /var/log/messages is back, but it has no
entries between June 15 and when I rebooted it the 2nd time on June
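
To pin down exactly when logging stopped, a small script can look for the widest gap between consecutive timestamps in the log. This is a rough Python sketch, not an established tool: it assumes the traditional syslog line prefix ("Mon DD HH:MM:SS"), an English/C locale for month names, and a year supplied by the caller because syslog lines do not record one. The function name largest_gap is made up for illustration.

```python
from datetime import datetime


def largest_gap(lines, year=2007):
    """Return (gap, before, after) for the widest gap between consecutive
    timestamped lines, or None if fewer than two lines could be parsed."""
    previous = None
    best = None
    for line in lines:
        try:
            # The first 15 characters hold the syslog timestamp.
            stamp = datetime.strptime(f"{year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parseable timestamp
        if previous is not None:
            gap = stamp - previous
            if best is None or gap > best[0]:
                best = (gap, previous, stamp)
        previous = stamp
    return best
```

Running it over the lines of /var/log/messages (and any rotated copies) gives the last timestamp written before the gap, which is a reasonable starting point for matching against other logs on the host.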
Any ideas on where I should start looking?
Is there some way to read array status from a PERC controller under Linux?
Any suggestions will be appreciated!