[CentOS] Server hangs on CentOS 5.5

Tue Mar 8 18:31:34 UTC 2011
Michael Eager <eager at eagerm.com>

Les Mikesell wrote:
> On 3/8/2011 11:24 AM, Michael Eager wrote:
>> Hi --
>>
>> I'm running a server which is usually stable, but every
>> once in a while it hangs.  The server is used as a file
>> store using NFS and to run VMware machines.
>>
>> I don't see anything in /var/log/messages or elsewhere
>> to indicate any problem or offer any clue why the system
>> was hung.
>>
>> Any suggestions where I might look for a clue?
> 
> Probably something hardware related.  Bad memory, overheating, power 
> supply, etc.  I've even seen some rare cases where a bios update would 
> fix it although it didn't make much sense for a machine to run for 
> years, then need a firmware change.

The system is on a UPS and temps seem reasonable.
Locating a transient memory problem is time-consuming.
Identifying a power supply which sometimes spikes is
even more difficult.  I'd like to have a clue about the
likely problem before shutting down the server for an
extended period.

I'll set up sar and sensord to periodically log system
status and see if that gives me a clue the next time
this happens.
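
As a rough illustration of the periodic-logging idea (this is not
how sar or sensord work internally, just a stand-in sketch), something
like the following could append a timestamped load and memory line
to a file and fsync it, so the last entry written before a hang is
more likely to reach the disk. The log path, the interval, and the
function names are all hypothetical, and it assumes a Linux /proc
filesystem and Python 3, which is newer than the Python that ships
with CentOS 5.

```python
import os
import time


def read_meminfo(path="/proc/meminfo"):
    """Return the numeric fields of a meminfo-style file as a dict."""
    info = {}
    with open(path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            parts = rest.split()
            if parts and parts[0].isdigit():
                info[key.strip()] = int(parts[0])
    return info


def snapshot(loadavg_path="/proc/loadavg", meminfo_path="/proc/meminfo"):
    """Build one log line: timestamp, 1/5/15-minute load, free memory and swap."""
    with open(loadavg_path) as f:
        load1, load5, load15 = f.read().split()[:3]
    mem = read_meminfo(meminfo_path)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "%s load=%s/%s/%s memfree_kb=%s swapfree_kb=%s" % (
        stamp, load1, load5, load15,
        mem.get("MemFree", "?"), mem.get("SwapFree", "?"))


def append_synced(log_path, line):
    """Append a line, then flush and fsync before returning."""
    with open(log_path, "a") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())


def main(log_path="/var/tmp/hangwatch.log", interval=60):
    """Log one snapshot every `interval` seconds until killed."""
    while True:
        append_synced(log_path, snapshot())
        time.sleep(interval)

# Call main() to start logging; the path and interval above are placeholders.
```

After a hang, the last few lines of the log would show whether load
or memory use was climbing beforehand, or whether the box simply
stopped with no warning, which points more toward hardware.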


-- 
Michael Eager	 eager at eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077