Hi all, After a power failure, my server (CentOS 5.2 x86_64 with all updates) cannot boot. The error message on screen is:
-----------------------------------------------------------
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
<0> HARDWARE ERROR
CPU 1: Machine Check Exception: 7 Bank 4: ....
RIP 10:<.....>
TSC 133eab63c9 ADDR 24fe3d028
This is not a software problem!
Run through mcelog --ascii to decode and contact your hardware vendor
Kernel panic - not syncing: Uncorrected machine check
-----------------------------------------------------------
Could anyone tell me how to fix this, please? Help!
Thank you
Vnpenguin wrote:
Hi all, After a power failure, my server (CentOS 5.2 x86_64 with all updates) cannot boot. The error message on screen is:
Memory for crash kernel (0x0 to 0x0) notwithin permissible range <0> HARDWARE ERROR CPU 1: Machine Check Exception: 7 Bank 4: .... RIP 10:<.....> TSC 133eab63c9 ADDR 24fe3d028 This is not a software problem! Run through mcelog --ascii to decode and contact your hardware vendor Kernel panic - not syncing: Uncorrected machine check
Could anyone tell me how to fix this, please? Help!
you have a hardware problem. something fried on the motherboard, possibly the RAM, maybe something else. if the server is on some sort of service contract or warranty, call the hardware or support vendor. if not, find someone skilled at troubleshooting x86_64 server hardware.
I believe the "Machine Check Exception: 7 Bank 4" indicates it's a memory ECC issue with DIMM bank 4 on CPU 1 (I'm guessing this is an Opteron system?)
you might try booting a memtest86 CD and seeing if that runs.
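To make the reading of the panic line above easier to follow, here is a minimal sketch that pulls the CPU number, bank number, and ADDR value out of the text as it was posted. The function name `parse_mce` and the field names are made up for illustration; this is not what `mcelog --ascii` does internally, and the elided parts of the message ("....") are left elided.

```python
import re

# Text copied from the panic screen in the original post.
# The "...." and "<.....>" parts were elided by the poster and stay that way.
PANIC_TEXT = (
    "CPU 1: Machine Check Exception: 7 Bank 4: .... "
    "RIP 10:<.....> TSC 133eab63c9 ADDR 24fe3d028"
)


def parse_mce(text):
    """Extract CPU, exception value, bank, and ADDR from an MCE panic line.

    Hypothetical helper for illustration; returns only the fields it finds.
    """
    fields = {}
    m = re.search(r"CPU (\d+): Machine Check Exception:\s*(\w+) Bank (\d+)", text)
    if m:
        fields["cpu"] = int(m.group(1))
        fields["exception"] = m.group(2)  # kept as text, not interpreted
        fields["bank"] = int(m.group(3))
    m = re.search(r"ADDR ([0-9a-fA-F]+)", text)
    if m:
        fields["addr"] = int(m.group(1), 16)  # the value is printed in hex
    return fields


result = parse_mce(PANIC_TEXT)
print(result)
```

This only separates the fields; deciding which DIMM or component they point to still needs the vendor's documentation or a decoder such as mcelog.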
Rainer Duffner wrote:
John R Pierce wrote:
you might try booting a memtest86 CD and seeing if that runs.
Is there a memory-tester that
- isn't i386-only
- and goes beyond 4 GB?
I believe both memtest86 and memtest86+ (a fork) can test over 4 GB, likely using PAE, but neither is natively 64-bit yet.
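As a rough check on why the 4 GB question matters here, the sketch below converts the ADDR value from the panic message into GiB and compares it with the 4 GiB limit of plain 32-bit addressing and with the 64 GiB reachable through 36-bit PAE. It assumes that ADDR is a physical memory address and that a PAE-based tester uses the classic 36-bit width; neither is confirmed in this thread, and the helper names are hypothetical.

```python
GIB = 1 << 30  # bytes in one GiB


def pae_limit_bytes(address_bits=36):
    """Size of the physical address space for a given address width.

    36 bits is the classic PAE width (an assumption for this sketch).
    """
    return 1 << address_bits


def describe_address(addr):
    """Report where an address sits relative to the 4 GiB and PAE limits."""
    return {
        "gib": addr / GIB,
        "above_4gib": addr >= 4 * GIB,
        "within_pae": addr < pae_limit_bytes(),
    }


# ADDR value as reported in the panic message (hexadecimal).
info = describe_address(0x24FE3D028)
print(f"{info['gib']:.2f} GiB, above 4 GiB: {info['above_4gib']}, "
      f"within 36-bit PAE range: {info['within_pae']}")
```

Under those assumptions the reported address lies a little above 9 GiB, so a tester limited to the first 4 GB would never touch it, while one using PAE could.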
Rainer Duffner wrote:
John R Pierce wrote:
you might try booting a memtest86 CD and seeing if that runs.
Is there a memory-tester that
- isn't i386-only
- and goes beyond 4 GB?
Rainer
In the past I tested a 32 GB server with memtest86+, and when there were problems it actually found bad RAM all over the range. Whether it uses PAE or not, I don't know.
HTH,
Kay
On Fri, Feb 13, 2009 at 3:35 AM, John R Pierce pierce@hogranch.com wrote:
you have a hardware problem. something fried on the motherboard, possibly the RAM, maybe something else. if the server is on some sort of service contract or warranty, call the hardware or support vendor. if not, find someone skilled at troubleshooting x86_64 server hardware.
<snip> I doubt a warranty will cover electrical damage, but you can ask. An insurance policy is more likely to cover this. :-) Run diagnostics on the RAM (which is the reported error) and on your motherboard. The PSU may also be damaged. Run diagnostics on the hard drives after you get the server up and running.
Fixed! I booted with the old kernel and then rebooted with the new kernel. It's OK now.
Thanks