[CentOS] Interpretation of a hardware error
Peter Kjellström
cap at nsc.liu.se
Fri Apr 13 09:42:13 UTC 2012
On Thursday 12 April 2012 13.36.03 m.roth at 5-cent.us wrote:
> Hey, folks,
>
> I've just started seeing
> Apr 12 13:09:59 <server> kernel: [Hardware Error]:
> MC4_STATUS[Over|CE|MiscV|-|AddrV|-|Poison|CECC]: 0xdd0accf2001d011b
> Apr 12 13:09:59 <server> kernel: [Hardware Error]: Northbridge Error (node
> 1, core 1): ECC error in L3 cache tag.
The error message certainly points to the CPU. The fact that the error
occurred in the cache tag rather than the cache data further implicates the
CPU. The message is quite specific, and I'd say rather trustworthy...
But there's also the possibility that the message is wrong (either something
else went wrong or nothing really went wrong). In my experience, hardware
fault error messages are quite unreliable, and at the end of the day DIMMs are
orders of magnitude more likely to fail than CPUs...
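
For anyone who wants to cross-check the flags the kernel printed, here is a
minimal sketch that decodes the MC4_STATUS value quoted above. The bit
positions are assumptions: the generic x86 machine-check status layout (bits
63 to 57) plus the AMD-specific CECC/UECC/Deferred/Poison bits as I understand
the kernel's decoder to use them. The names FLAGS and decode_mc_status are
made up for illustration; check the BKDG for your CPU family before relying
on any of this.

```python
# Sketch only: bit positions are assumptions (generic x86 MCA status layout
# plus AMD-specific bits 46..43); verify against the BKDG for your CPU family.
FLAGS = {
    63: "Val",       # register contains a valid error
    62: "Over",      # an earlier error was overwritten (overflow)
    61: "UC",        # uncorrected error; when clear the kernel prints "CE"
    60: "En",        # error reporting was enabled for this error
    59: "MiscV",     # MCi_MISC register holds valid data
    58: "AddrV",     # MCi_ADDR register holds a valid address
    57: "PCC",       # processor context corrupt
    46: "CECC",      # correctable ECC error (AMD-specific, assumed)
    45: "UECC",      # uncorrectable ECC error (AMD-specific, assumed)
    44: "Deferred",  # deferred error (AMD-specific, assumed)
    43: "Poison",    # poisoned data (AMD-specific, assumed)
}


def decode_mc_status(status):
    """Return the names of the flag bits set in a 64-bit MCi_STATUS value."""
    return [name
            for bit, name in sorted(FLAGS.items(), reverse=True)
            if status & (1 << bit)]


if __name__ == "__main__":
    # Value taken from the log line quoted above.
    print(decode_mc_status(0xdd0accf2001d011b))
    # prints ['Val', 'Over', 'En', 'MiscV', 'AddrV', 'CECC', 'Poison']
```

With UC and PCC clear, this agrees with the "Over|CE|MiscV|-|AddrV|-|Poison|CECC"
string in the log, i.e. a corrected error with the overflow bit set. It only
confirms that the printed flags match the raw value; it says nothing about
whether the hardware reported the right component.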
/Peter