[CentOS] how to debug hardware lockups?
rudiahlers at gmail.com
Sat Nov 15 23:21:42 UTC 2008
On Sun, Nov 16, 2008 at 1:14 AM, John R Pierce <pierce at hogranch.com> wrote:
> Rudi Ahlers wrote:
>> Well, on a standard CentOS 5.2, /var/log/messages will be the
>> place to log problems like this, or where else can I get more info?
> It's tough to write to the disk when the kernel is crashing. Ditto the
> network. That leaves VGA and serial ports, which can be written to by
> self-contained emergency crash routines...
> IIRC, you said this was a Q9something quad core... that's a desktop
> processor... does this server have ECC memory? (I ask because few desktop
> platforms do, while ECC is fairly standard on servers.) Without ECC, the
> system has no way of knowing it read bad data from RAM. If the bad data
> happens to be code, and that code happens to be in the kernel, ka-RASH:
> without any detection or warning it leaps off into never-land, and you
> get a kernel fault, almost always resulting in...
> kernel panic
> system halted
> with no additional useful information available. With ECC memory,
> single-bit errors get corrected on the fly and logged as ECC error events,
> while double-bit errors result in a system halt with a message saying so.
No, the motherboard doesn't support ECC RAM. The motherboard is an
Intel DG35EC - http://www.intel.com/products/desktop/motherboards/DG35EC/DG35EC-overview.htm
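
For anyone unfamiliar with the single-bit versus double-bit distinction
quoted above, here is a toy sketch of a SECDED (single error correct,
double error detect) code in Python. It is only an illustration under
stated assumptions: it protects 4 data bits with an extended Hamming(8,4)
code, whereas real ECC memory does this in hardware over much wider words.
The function names encode and decode are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit SECDED codeword (list of 0/1 bits).

    Index 0 holds an overall parity bit; indices 1, 2 and 4 hold Hamming
    parity bits; indices 3, 5, 6 and 7 hold the data bits.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    # Parity bit p covers every position whose index has bit p set.
    for p in (1, 2, 4):
        for i in range(1, 8):
            if i & p and i != p:
                code[p] ^= code[i]
    # Overall parity makes the total number of 1 bits even.
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    # The syndrome is the XOR of the indices of all set bits in 1..7.
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treated as one flipped bit. The syndrome is its index
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped, cannot fix.
        return None, "uncorrectable"
    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return nibble, status


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1                 # one flipped bit
    print(decode(word))
    word[6] ^= 1                 # a second flipped bit
    print(decode(word))
```

Without the parity bits there is nothing to compare against, which is the
non-ECC situation described above: the flipped bit is simply read back as
if it were valid data.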