On Tuesday 22 June 2010, Eric Deis wrote:
I have recently upgraded to 2.6.18-194.3.1.el5 and within several days the machine crashed with the following error (repeating in mcelog):
I'm guessing the old kernel just didn't notice.
The MCEs below indicate bad hardware. Since the DIMMs are a lot easier to debug, I'd suggest you start there (but it could be the system board too). Try running with half your DIMMs, then the other half.
/Peter
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 BANK 8 MISC 41
...
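
If mcelog is full of repeats, a quick tally by CPU and bank can show whether the errors cluster in one place while you swap DIMMs around. Below is a rough Python sketch, assuming each record contains "CPU <n> BANK <n>" as in the excerpt above. The function name count_mce_sources is made up for illustration, and the script only counts records; what a given bank corresponds to depends on the CPU model.

```python
import re
from collections import Counter

# Matches the "CPU <n> BANK <n>" fields as they appear in the excerpt above.
# Assumption: every record in the log carries both fields in this order.
RECORD_RE = re.compile(r"CPU\s+(\d+)\s+BANK\s+(\d+)")


def count_mce_sources(log_text):
    """Return a Counter keyed by (cpu, bank) for every record found in log_text."""
    return Counter(
        (int(cpu), int(bank)) for cpu, bank in RECORD_RE.findall(log_text)
    )


if __name__ == "__main__":
    # Sample text copied from the excerpt; a real run would read the mcelog file.
    sample = (
        "MCE 0\n"
        "HARDWARE ERROR. This is *NOT* a software problem!\n"
        "Please contact your hardware vendor\n"
        "CPU 2 BANK 8 MISC 41\n"
    )
    for (cpu, bank), n in count_mce_sources(sample).most_common():
        print(f"CPU {cpu} BANK {bank}: {n} event(s)")
```

If the same CPU/BANK pair keeps showing up with one half of the DIMMs installed and disappears with the other half, that narrows the search considerably.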