[CentOS] "EDAC i5000 MC0: FATAL ERRORS Found!!!" error message?

Mon Oct 13 13:45:57 UTC 2008
Tim Verhoeven <tim.verhoeven.be at gmail.com>

On Mon, Oct 13, 2008 at 3:38 PM, Jeff <jpotter-centos at codepuppy.com> wrote:
>
> We had the following error thrown on console on a PowerEdge server running
> CentOS 5 (64 bit). Googling around didn't yield any particular insights. The
> server crashed a few minutes after this message. Running memtester, just to
> check, didn't find anything; and the box has been running for months before
> this without issue.
> I'm wondering if anyone has run across this before, and if so, if it was
> software (CentOS) or hardware (PowerEdge / PowerVault) related?
> Oct  8 12:19:35 someServer kernel: EDAC i5000 MC0: FATAL ERRORS Found!!! 1st
> FATAL Err Reg= 0x4
> Oct  8 12:19:35 someServer kernel: EDAC i5000 MC0: >Tmid Thermal event with
> intelligent throttling disabled
> Oct  8 12:19:35 someServer kernel: EDAC MC0: UE row 1, channel-a= 2
> channel-b= 3 labels "-": (Branch=1 DRAM-Bank=0 RDWR=Write RAS=11802 CAS=0
> FATAL Err=0x4)

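IIRC "EDAC i5000" is the kernel's EDAC (Error Detection And
Correction) driver for the Intel 5000 chipset memory controller in
the server. The "UE" line reports an uncorrectable memory error on
row 1 (channels 2 and 3, branch 1), right after a thermal event, so
it looks like something went wrong with a DIMM and that is probably
why it crashed. So you may have an (intermittent) hardware issue
rather than a CentOS problem.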
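
If you want to watch for a repeat and see whether the same row and
channels show up again, the UE line can be pulled apart with a small
script. This is only a rough sketch in Python, assuming the message
keeps the exact format of the line you quoted; the function name
parse_ue_line and the regex are my own illustration, not part of any
EDAC tool.

```python
import re

# Pattern for the decoded "UE row ..." line as quoted in this thread.
# Assumption: other kernel versions or drivers may format it differently.
UE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): UE row (?P<row>\d+), "
    r"channel-a= (?P<chan_a>\d+) channel-b= (?P<chan_b>\d+) .*?"
    r"\(Branch=(?P<branch>\d+) DRAM-Bank=(?P<bank>\d+) RDWR=(?P<rdwr>\w+) "
    r"RAS=(?P<ras>\d+) CAS=(?P<cas>\d+)"
)


def parse_ue_line(line):
    """Return the decoded fields of an EDAC UE log line, or None if no match."""
    # Collapse runs of whitespace so a log line wrapped by a mail client
    # (without quote markers) still matches.
    flat = " ".join(line.split())
    match = UE_LINE.search(flat)
    if match is None:
        return None
    # Everything except the read/write direction is a decimal number.
    return {
        key: (value if key == "rdwr" else int(value))
        for key, value in match.groupdict().items()
    }
```

Feeding it lines from /var/log/messages and comparing the row, channel
and branch values across events should show whether the errors keep
hitting the same place. How those values map to a physical DIMM slot
depends on the board, so check the server's documentation for that.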

Regards,
Tim

-- 
Tim Verhoeven - tim.verhoeven.be at gmail.com - 0479 / 88 11 83

Hoping the problem magically goes away by ignoring it is the
"microsoft approach to programming" and should never be allowed.
(Linus Torvalds)