[CentOS] Random server reboot after update to CentOS 5.3

Fri May 22 07:38:56 UTC 2009
Peter Hopfgartner <peter.hopfgartner at r3-gis.com>

William L. Maltby wrote:
> On Thu, 2009-05-21 at 20:07 +0200, Ralph Angenendt wrote:
>   
>> Lanny Marcus wrote:
>>     
>>> It would have been helpful, if the error message told you which system
>>> software.  :-)   The Upgrade from 5.2 to 5.3 seems to have been
>>> problematic for some people on this list.  When you did the Upgrade,
>>> did you follow this sequence, per the CentOS 5.3 Release Notes?
>>> yum clean all && yum update glibc\* && yum update
>>> If not, that may or may not have anything to do with the reboots and
>>> error messages you have received.
>>>       
>> No, it would not have to do anything with the spontaneous reboots. The 
>> problem with glibc only concerns rpm, as the release notes clearly state.
>>     
>
> Since nobody else mentioned it to the OP, ...
>
> Let us not forget that often the hardware chooses to act up around the
> same time that some kind of (software) upgrade is performed. I've wasted
> a lot of time in the past *assuming* that because the hardware was
> rock-solid in the past, it must have been some change (I made) to the
> software.
>
> I suggest running diagnostics, or manually re-seating everything
> (especially if you had occasion to move the unit or open it recently).
>
> Memory used to age, does it still? Memtest*86 might be in order.
>
>   
The machine has ECC RAM and extensive hardware logging built in. It 
would be quite unusual for the built-in server logging to miss a 
memory failure. The server diagnostics from Dell do not show any 
hardware failure at all, only a software problem. In my experience, 
memory problems lead in most cases to application instability, but I've 
never seen a reboot like this one, which leaves absolutely no trace in 
the Linux logs. Take the reboot that happened last night:

The Dell utility says:

Severity      : Critical
Date and Time : Thu May 21 21:16:16 2009
Description   : System Software event: run-time critical stop was asserted

From /var/log/messages:

May 21 10:58:38 servernew auditd[18962]: Init complete, auditd 1.7.7 listening for events (startup state enable)
May 21 21:18:58 servernew syslogd 1.4.1: restart.
May 21 21:18:58 servernew kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 21 21:18:58 servernew kernel: Bootdata ok (command line is ro root=LABEL=/)
May 21 21:18:58 servernew kernel: Linux version 2.6.18-128.1.10.el5xen (mockbuild at builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Thu May 7 11:07:18 EDT 2009
May 21 21:18:58 servernew kernel: BIOS-provided physical RAM map:

To sum up: nothing happens for hours and then, zack!, boom!, the server 
restarts.
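
For anyone who wants to look for the same pattern in their own logs, here 
is a minimal Python sketch that reports how long the log was silent before 
each syslogd restart line. The function name find_restart_gaps and the 
year parameter are made up for illustration (syslog timestamps carry no 
year, so one has to be assumed). Note that a "restart" line can also 
appear when syslogd is merely restarted, for example by log rotation, so 
a hit is only a hint, not proof of a reboot.

```python
from datetime import datetime


def find_restart_gaps(lines, year=2009):
    """Return (last_entry_before, restart_time, gap) for each syslogd restart.

    `year` is an assumption: classic syslog timestamps do not include one.
    """
    results = []
    previous = None  # timestamp of the most recent line that parsed
    for line in lines:
        try:
            # The first 15 characters look like "May 21 21:18:58".
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # wrapped continuation line or other unparseable text
        if "syslogd" in line and "restart" in line and previous is not None:
            results.append((previous, stamp, stamp - previous))
        previous = stamp
    return results


# The two relevant lines from the excerpt above.
sample = [
    "May 21 10:58:38 servernew auditd[18962]: Init complete, auditd 1.7.7 "
    "listening for events (startup state enable)",
    "May 21 21:18:58 servernew syslogd 1.4.1: restart.",
]

for before, restart, gap in find_restart_gaps(sample):
    print(f"last entry {before}, restart {restart}, silent for {gap}")
```

To run it against a real log, pass the open file instead of the sample 
list, e.g. find_restart_gaps(open("/var/log/messages")).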


Regards,

Peter


-- 
 
Dott. Peter Hopfgartner
 
R3 GIS Srl - GmbH
Via Johann Kravogl-Str. 2
I-39012 Meran/Merano (BZ)
Email: peter.hopfgartner at r3-gis.com
Tel. : +39 0473 494949
Fax  : +39 0473 069902
www  : http://www.r3-gis.com