[CentOS] died again

Mon Nov 25 16:54:14 UTC 2013
Les Mikesell <lesmikesell at gmail.com>

On Mon, Nov 25, 2013 at 10:45 AM, Michael Hennebry
<hennebry at web.cs.ndsu.nodak.edu> wrote:
>> Keep an eagle eye on dmesg and the logs. If you can, bring
>> machine down and run memtest86 for a few hours (say, when you go to
> I've run the memory test that comes with the Fedora 13 install disk.
> My computer's memory got a clean bill of health.

I've seen a machine where it took 3+ days of running memtest86 to
catch the error.  And then after replacing the RAM, the machine still
crashed occasionally.  It turned out the software RAID1 mirrors had
mismatching contents caused by the bad RAM, and even though it would
check clean, a read would sometimes come from the other mirror.
After fixing that, the server has run for years.
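
For anyone wanting to look for this kind of mirror mismatch on Linux md
software RAID, here is a minimal sketch, not what was actually run on
that server. It assumes the arrays show up as /sys/block/md*/ and that a
"check" pass has already been requested (by writing "check" to the
array's md/sync_action file as root) and has finished; the kernel then
reports what it found in md/mismatch_cnt. The function names are made up
for illustration.

```python
from pathlib import Path


def read_mismatch_counts(sys_block="/sys/block"):
    """Return {md_device: mismatch_cnt} for each md array under sys_block.

    A value of None means the counter could not be read or parsed.
    """
    counts = {}
    for cnt_file in sorted(Path(sys_block).glob("md*/md/mismatch_cnt")):
        device = cnt_file.parent.parent.name  # e.g. "md0"
        try:
            counts[device] = int(cnt_file.read_text().strip())
        except (OSError, ValueError):
            counts[device] = None
    return counts


def arrays_with_mismatches(counts):
    """Return the devices whose last check reported a nonzero count."""
    return [dev for dev, n in counts.items() if n]


if __name__ == "__main__":
    # Only meaningful after a completed "check" pass on each array.
    for dev, n in read_mismatch_counts().items():
        print(f"{dev}: mismatch_cnt={n}")
```

A nonzero count is a hint that the mirrors differ, not proof of damage,
so treat it as a reason to dig further rather than a verdict.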

But in general, I always suspect power supplies first for mysterious crashes.

   Les Mikesell
     lesmikesell at gmail.com