[CentOS] Server spontaneously rebooting under RHEL-4
Benjamin J. Weiss
benjamin at birdvet.org
Wed Mar 29 12:08:16 UTC 2006
James Olin Oden wrote:
>On 3/28/06, BRUCE STANLEY <bruce.stanley at prodigy.net> wrote:
><snip>
>
>
>>There could even be a simpler reason for this problem.
>>We had a server do this very thing under RHEL-3 and it
>>turned out to be hardware related.
>>
>>The service technicians came in and reset the memory and CPU,
>>replaced the CPU fan, and reset the BIOS.
>>
>>
>>
>One thing that I have seen occur more often with 2.6 kernels is
>catching of MCEs (Machine Check Exceptions). An MCE is the
>processor's way of saying that it has detected something extremely
>wrong. This will typically cause a panic, though not a reboot.
>OTOH, if your hardware also has support for a watchdog, then a
>reboot would occur shortly after the panic.
>
>I'm not saying that this is what is actually happening, but just that
>along the lines of what has been said thus far, this would make sense.
> If indeed this is the case, maybe the panic output is in
>/var/log/messages.
>
>Cheers...james
>
>
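A minimal sketch of the log check James suggests, assuming a Python 3 interpreter is available on the box. The search patterns and the helper names (find_suspect_lines, scan_log) are illustrative assumptions; the exact wording the kernel logs for a machine check or panic varies between versions, so adjust the patterns to what your logs actually contain.

```python
import re

# Assumed patterns: the exact kernel wording differs between versions.
PATTERNS = [
    re.compile(r"machine check", re.IGNORECASE),
    re.compile(r"\bMCE\b"),
    re.compile(r"kernel panic", re.IGNORECASE),
]


def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path="/var/log/messages"):
    """Scan a syslog-style file; the default path is the one named above."""
    with open(path, errors="replace") as handle:
        return find_suspect_lines(handle)


if __name__ == "__main__":
    try:
        for number, text in scan_log():
            print(f"{number}: {text}")
    except OSError as exc:
        # Reading /var/log/messages usually requires root.
        print(f"could not read log: {exc}")
```

If the machine reboots before syslog flushes to disk, the panic text may never reach the file, so an empty result does not rule out an MCE.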
Well, so far it looks like something is wrong with our memory
subsystem. I updated all the BIOSes and ran Smart Disk diagnostics. I'm
getting an ECC error on module 4, whether I have RAM in the slot or not!
We're calling HP support; I'm sure we'll be able to get it fixed.
Thanks, all!
Ben