[CentOS] Re: how to debug hardware lockups?

nate centos at linuxpowered.net
Wed Nov 19 03:03:56 UTC 2008


Rudi Ahlers wrote:

> Sure, I understand that. But then again, on my Dell servers, when I
> have problems, I sit with the same issues. And those expensive
> motherboards don't give me anything more than the cheaper ones. In
> fact, when the RAM failed on the Dells, they were unusable until I
> could get new RAM from a different supplier. With the cheaper board, I
> drive down to the first PC shop and get new RAM.

I suppose it depends on which Dells you have. The latest 1950 III
systems we have come with moderately good diagnostics, similar to
HP systems. The system log tells me which DIMM is spitting out
errors, so I don't need to go through the trouble of narrowing
down which one(s) is bad.
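
For boards without that kind of system log, something similar can
sometimes be approximated from the OS side. The sketch below is only
an illustration, not what the Dell diagnostics do: it assumes a Linux
kernel with an EDAC driver loaded for the memory controller and the
csrow-style sysfs layout under /sys/devices/system/edac/mc. The
function names read_edac_counts and failing_rows are made up for this
example. It reads the corrected (ce_count) and uncorrected (ue_count)
error counters per chip-select row and lists the rows that have
logged anything.

```python
import glob
import os


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {(controller, csrow): (ce_count, ue_count)}.

    Assumes the csrow-style EDAC sysfs layout; returns an empty
    dict if no EDAC driver is loaded or the path does not exist.
    """
    counts = {}
    for csrow in sorted(glob.glob(os.path.join(root, "mc*", "csrow*"))):
        controller = os.path.basename(os.path.dirname(csrow))
        row = os.path.basename(csrow)
        values = []
        for name in ("ce_count", "ue_count"):
            try:
                with open(os.path.join(csrow, name)) as f:
                    values.append(int(f.read().strip()))
            except (OSError, ValueError):
                # Missing or unreadable counter: treat as zero.
                values.append(0)
        counts[(controller, row)] = tuple(values)
    return counts


def failing_rows(counts):
    """Return the (controller, csrow) pairs with any logged error."""
    return sorted(key for key, (ce, ue) in counts.items() if ce or ue)


if __name__ == "__main__":
    for controller, row in failing_rows(read_edac_counts()):
        print(controller, row)
```

A csrow is not always a single physical DIMM, so mapping a row back
to a slot still depends on the board's documentation.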

I only started using Dell recently, when I started my new job in
March; before that it was mostly HP and Supermicro. HP certainly has
great quality stuff, though you do generally pay quite a bit more
for it. Whether I'd really push for that level of quality depends
on what the server is doing. Certainly anything that
is a single point of failure I would want on a higher quality
system. I'm not sure if Dell's motherboards go so far as to have
diagnostic LEDs on them to point out which part is faulty. HP has
been doing that for a long time now.

The latest HP G5s bring the LEDs out to the front of the chassis, so
you don't even have to open it up or load any software; you can
just look at the front and see if a DIMM is going bad, or a
voltage regulator, or a PSU, or a CPU, etc. Earlier systems just
had a generic health LED, which would say good/degraded/bad, but
it couldn't give any information as to what was causing the
problem.

Granted, that's not as useful for a remote location if nobody is on
site to look at the LEDs, though for many smaller places that
actually do have people on site on a regular basis it's really
handy.

nate



