Dear CentOS community, can someone give me clues as to whether my memory is going bad or I am having a problem with the actual board? Thank you in advance.
I am getting the following errors on stdout and also in /var/log/messages:
Aug 15 20:37:10 saturn kernel: Northbridge Error, node 0
Aug 15 20:37:10 saturn kernel: ECC/ChipKill ECC error.
Aug 15 20:37:10 saturn kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x1b9e740
Aug 15 20:37:10 saturn kernel: EDAC MC0: CE page 0x1b9e, offset 0x740, grain 0, syndrome 0x1cc8, row 2, channel 0, label "": amd64_edac
Aug 15 20:37:10 saturn kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
Aug 15 23:33:41 saturn kernel: Northbridge Error, node 0
Aug 15 23:33:41 saturn kernel: ECC/ChipKill ECC error.
Aug 15 23:33:41 saturn kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x1098d00
Aug 15 23:33:41 saturn kernel: EDAC MC0: CE page 0x1098, offset 0xd00, grain 0, syndrome 0x976f, row 2, channel 0, label "": amd64_edac
Aug 15 23:33:41 saturn kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
Aug 16 02:56:30 saturn kernel: Northbridge Error, node 1
Aug 16 02:56:30 saturn kernel: ECC/ChipKill ECC error.
Aug 16 02:56:30 saturn kernel: EDAC amd64 MC1: CE ERROR_ADDRESS= 0x80bd9cc00
Aug 16 02:56:30 saturn kernel: EDAC MC1: CE page 0x80bd9c, offset 0xc00, grain 0, syndrome 0xe08f, row 3, channel 0, label "": amd64_edac
Aug 16 02:56:30 saturn kernel: EDAC MC1: CE - no information available: amd64_edacError Overflow
Aug 17 02:17:02 saturn kernel: Northbridge Error, node 0
Aug 17 02:17:02 saturn kernel: ECC/ChipKill ECC error.
Aug 17 02:17:02 saturn kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x1e25fd0
Aug 17 02:17:02 saturn kernel: EDAC MC0: CE page 0x1e25, offset 0xfd0, grain 0, syndrome 0x1cc8, row 2, channel 0, label "": amd64_edac
Aug 17 02:17:02 saturn kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
Aug 17 02:41:22 saturn kernel: Northbridge Error, node 1
Aug 17 02:41:22 saturn kernel: ECC/ChipKill ECC error.
Aug 17 02:41:22 saturn kernel: EDAC amd64 MC1: CE ERROR_ADDRESS= 0x80d2ce600
Aug 17 02:41:22 saturn kernel: EDAC MC1: CE page 0x80d2ce, offset 0x600, grain 0, syndrome 0xe08f, row 3, channel 0, label "": amd64_edac
Aug 17 02:41:22 saturn kernel: EDAC MC1: CE - no information available: amd64_edacError Overflow
Aug 17 04:07:16 saturn kernel: Northbridge Error, node 0
Aug 17 04:07:16 saturn kernel: ECC/ChipKill ECC error.
Aug 17 04:07:16 saturn kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x41fe79200
Aug 17 04:07:16 saturn kernel: EDAC MC0: CE page 0x41fe79, offset 0x200, grain 0, syndrome 0xa612, row 3, channel 0, label "": amd64_edac
Aug 17 04:07:16 saturn kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
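To see whether the correctable errors (CE) in the log above cluster on particular rows, the detailed "CE page" lines can be tallied per memory controller, row, and channel. The following is only a rough sketch: the names `tally_ce`, `error_address`, and `sample` are made up for illustration, and the 4 KiB page size used to rebuild the address is an assumption that happens to agree with the ERROR_ADDRESS values logged here.

```python
import re
from collections import Counter

# Matches the detailed correctable-error lines as they appear in this log.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE page 0x(?P<page>[0-9a-f]+), "
    r"offset 0x(?P<offset>[0-9a-f]+), grain \d+, "
    r"syndrome 0x(?P<syndrome>[0-9a-f]+), "
    r"row (?P<row>\d+), channel (?P<channel>\d+)"
)


def tally_ce(log_text):
    """Count CE events per (memory controller, row, channel)."""
    counts = Counter()
    for m in CE_LINE.finditer(log_text):
        key = (int(m.group("mc")), int(m.group("row")), int(m.group("channel")))
        counts[key] += 1
    return counts


def error_address(page_hex, offset_hex, page_shift=12):
    """Rebuild the address from page and offset (assumes 4 KiB pages)."""
    return (int(page_hex, 16) << page_shift) | int(offset_hex, 16)


# The six detailed CE lines from the log, with the syslog prefix removed.
sample = """\
EDAC MC0: CE page 0x1b9e, offset 0x740, grain 0, syndrome 0x1cc8, row 2, channel 0, label "": amd64_edac
EDAC MC0: CE page 0x1098, offset 0xd00, grain 0, syndrome 0x976f, row 2, channel 0, label "": amd64_edac
EDAC MC1: CE page 0x80bd9c, offset 0xc00, grain 0, syndrome 0xe08f, row 3, channel 0, label "": amd64_edac
EDAC MC0: CE page 0x1e25, offset 0xfd0, grain 0, syndrome 0x1cc8, row 2, channel 0, label "": amd64_edac
EDAC MC1: CE page 0x80d2ce, offset 0x600, grain 0, syndrome 0xe08f, row 3, channel 0, label "": amd64_edac
EDAC MC0: CE page 0x41fe79, offset 0x200, grain 0, syndrome 0xa612, row 3, channel 0, label "": amd64_edac
"""

for (mc, row, channel), n in sorted(tally_ce(sample).items()):
    print("MC%d row %d channel %d: %d CE" % (mc, row, channel, n))
```

For this log the tally is three errors on MC0 row 2, two on MC1 row 3, and one on MC0 row 3, all on channel 0. Which physical slot a given row corresponds to depends on the board, so the manual is still needed to map these to modules.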
From: Lisandro Grullon lgrullon@CityTech.Cuny.Edu
Can someone give me clues as to whether my memory is going bad or I am having a problem with the actual board? Thank you in advance.
Is there any LED lit on the motherboard (even better if it is next to a RAM slot)? Usually the best approach, if you can, is to swap RAM modules. If the error follows the RAM module, it is a module problem. If the error stays at the same position, it is the motherboard.
JD
Thank you, John. I surely hope that shifting RAM around will fix the issue... this board is extremely expensive to replace: about 2K for the board alone.
John Doe jdmls@yahoo.com 8/17/2011 8:54 AM >>>
_______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
On Wed, 17 Aug 2011, John Doe wrote:
To: CentOS mailing list centos@centos.org
From: John Doe jdmls@yahoo.com
Subject: Re: [CentOS] Strange Kernel Warning.
Is there any LED lit on the motherboard (even better if it is next to a RAM slot)? Usually the best approach, if you can, is to swap RAM modules. If the error follows the RAM module, it is a module problem. If the error stays at the same position, it is the motherboard.
Another option is to take all the memory out, then put one module at a time back onto the motherboard and test it with memtest86+. It should be on the install CD/DVD.
Once you have identified a memory module that tests without errors, try that in each of the other slots if possible.
If there are no errors in any of the other slots, then you will need to test the other memory modules in a slot you know works OK, to see if it's one of the other memory modules that is faulty.
It's also possible for high-density and low-density memory modules to each work and test OK individually on a motherboard. However, when high-density and low-density modules are installed together in the same system, you might find errors occurring.
I had this on a CentOS 5.5 32-bit system, and it was very frustrating to locate the cause of the error. Testing one memory module at a time takes the guesswork out of which module may be the faulty one. But NEVER mix HD and LD modules, as they don't always work well together.
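While swapping modules between slots, it can help to snapshot the EDAC correctable-error counters before and after each run and see which row moved. Below is a minimal sketch, assuming the kernel exposes per-row counters as `mc*/csrow*/ce_count` under `/sys/devices/system/edac/mc` (newer kernels may lay this out differently); the function names `read_ce_counts` and `diff_counts` are hypothetical.

```python
import glob
import os


def read_ce_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {(mc_name, csrow_name): ce_count} for every csrow found.

    The directory layout is an assumption; an empty dict is returned
    when nothing matches (for example, when EDAC is not loaded).
    """
    counts = {}
    pattern = os.path.join(edac_root, "mc*", "csrow*", "ce_count")
    for path in sorted(glob.glob(pattern)):
        csrow_dir = os.path.dirname(path)
        mc_dir = os.path.dirname(csrow_dir)
        with open(path) as fh:
            value = int(fh.read().strip())
        counts[(os.path.basename(mc_dir), os.path.basename(csrow_dir))] = value
    return counts


def diff_counts(before, after):
    """Return only the rows whose CE count grew between two snapshots."""
    grown = {}
    for key, value in after.items():
        delta = value - before.get(key, 0)
        if delta > 0:
            grown[key] = delta
    return grown
```

Take one snapshot with `read_ce_counts()` after booting, run the workload or wait a day, take another, and pass both to `diff_counts`. If the growing row changes when a module is moved to another slot, that points at the module; if it stays put, that points at the slot or board.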
HTH
Keith Roberts
On Wed, 17 Aug 2011 08:17:58 -0400, Lisandro Grullon wrote:
Dear CentOS community, can someone give me clues as to whether my memory is going bad or I am having a problem with the actual board? Thank you in advance.
I am getting the following errors on stdout and also in /var/log/messages:
Hi,
Please tell us more about your system. (lspci, dmesg and cat /proc/mtrr)
Best regards,
Morten
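The three pieces of information requested above (lspci, dmesg and /proc/mtrr) can be gathered into one text blob for pasting into a reply. This is just a convenience sketch; the `collect` helper and the `COMMANDS` list are made up for illustration, and some of the commands may need root on a given system.

```python
import subprocess

# The commands requested in the message above.
COMMANDS = [["lspci"], ["dmesg"], ["cat", "/proc/mtrr"]]


def collect(commands=COMMANDS):
    """Run each command and return its output under a header line."""
    sections = []
    for cmd in commands:
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
            )
            body = result.stdout
        except OSError as exc:
            # The command is missing or cannot be executed on this system.
            body = "could not run: {}\n".format(exc)
        sections.append("===== {} =====\n{}".format(" ".join(cmd), body))
    return "\n".join(sections)
```

Calling `print(collect())` writes all three outputs to the terminal, one after another, each under a `=====` header naming the command.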
Sure, Morten,
lspci shows the following:
00:00.0 Host bridge: ATI Technologies Inc RD890 Northbridge only dual slot (2x16) PCI-e GFX Hydra part (rev 02)
00:04.0 PCI bridge: ATI Technologies Inc RD890 PCI to PCI bridge (PCI express gpp port D)
00:09.0 PCI bridge: ATI Technologies Inc RD890 PCI to PCI bridge (PCI express gpp port H)
00:0b.0 PCI bridge: ATI Technologies Inc RD890 PCI to PCI bridge (NB-SB link)
00:11.0 SATA controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 SATA Controller [IDE mode]
00:12.0 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB7x0 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB7x0 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3d)
00:14.1 IDE interface: ATI Technologies Inc SB7x0/SB8x0/SB9x0 IDE Controller
00:14.3 ISA bridge: ATI Technologies Inc SB7x0/SB8x0/SB9x0 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:19.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
00:1a.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:1a.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:1a.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:1a.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:1a.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
00:1b.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:1b.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:1b.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:1b.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:1b.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
01:09.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 10)
02:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
04:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
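The lspci listing above shows a DRAM Controller function on four PCI devices (00:18 through 00:1b), which suggests four northbridge nodes, each with its own memory controller. As a small sketch, the node list can be pulled out of lspci output as below. The mapping "node N lives at PCI device 0x18 + N" is the usual AMD Family 10h convention and is treated here as an assumption; `dram_nodes` and `lspci_sample` are illustrative names.

```python
import re

# One lspci line per device, e.g. "00:18.2 Host bridge: ... DRAM Controller".
DRAM_CTRL = re.compile(
    r"^[0-9a-f]{2}:(?P<dev>[0-9a-f]{2})\.[0-7] Host bridge: .*DRAM Controller",
    re.MULTILINE,
)


def dram_nodes(lspci_text, first_node_device=0x18):
    """Return node numbers inferred from DRAM Controller PCI devices.

    Assumes node N sits at PCI device first_node_device + N.
    """
    devices = {int(m.group("dev"), 16) for m in DRAM_CTRL.finditer(lspci_text)}
    return sorted(dev - first_node_device for dev in devices)


# The four DRAM Controller lines from the listing above.
lspci_sample = """\
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:19.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:1a.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:1b.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
"""

print(dram_nodes(lspci_sample))
```

Under that assumption, the "node 0" and "node 1" in the kernel messages would correspond to 00:18.x and 00:19.x, so the errors reported so far involve two of the four memory controllers.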
Morten Stevens mstevens@imt-systems.com 8/17/2011 9:11 AM >>>