After updating from CentOS 6.6 to CentOS 6.7 and rebooting, I get a lot of these errors in /var/log/messages:
May 3 11:27:20 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=6093 CAS=896, CE Err=0x10000 (Correctable Patrol Data ECC))
May 3 11:27:21 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=1 RDWR=Read RAS=1330 CAS=4, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
May 3 11:27:22 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=2673 CAS=4, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
May 3 11:27:23 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=1335 CAS=4, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
May 3 11:27:24 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=1335 CAS=4, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
May 3 11:27:25 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=240 CAS=4, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
May 3 11:27:26 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=1796 CAS=900, CE Err=0x10000 (Correctable Patrol Data ECC))
May 3 11:27:27 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=1337 CAS=4, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
May 3 11:27:28 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=3094 CAS=900, CE Err=0x10000 (Correctable Patrol Data ECC))
May 3 11:27:29 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=240 CAS=6, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
May 3 11:27:30 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=240 CAS=6, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))
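To see at a glance which row and channel these messages point at, the lines can be tallied with a small script. The following is only a rough sketch: it assumes Python 3 (which CentOS 6 does not ship by default, so it may be easier to run on a copy of the log elsewhere), the regular expression is written against the sample lines above and may not match what other EDAC drivers log, and the name `tally_ce` is made up for illustration.

```python
import re
from collections import Counter

# Assumption: the pattern mirrors the sample lines in this message only.
EDAC_CE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE row (?P<row>\d+), channel (?P<channel>\d+),"
    r".*?CE Err=0x[0-9a-fA-F]+ \((?P<kind>[^)]*)\)"
)

def tally_ce(text):
    """Count corrected-error records per (mc, row, channel, error kind)."""
    counts = Counter()
    # finditer also copes with several records sitting on one physical line.
    for m in EDAC_CE.finditer(text):
        key = (int(m["mc"]), int(m["row"]), int(m["channel"]), m["kind"])
        counts[key] += 1
    return counts

# Three of the lines quoted above, used as sample input.
sample = "\n".join([
    'May 3 11:27:20 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": '
    "(Branch=0 DRAM-Bank=2 RDWR=Read RAS=6093 CAS=896, "
    "CE Err=0x10000 (Correctable Patrol Data ECC))",
    'May 3 11:27:21 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": '
    "(Branch=0 DRAM-Bank=1 RDWR=Read RAS=1330 CAS=4, "
    "CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))",
    'May 3 11:27:22 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": '
    "(Branch=0 DRAM-Bank=2 RDWR=Read RAS=2673 CAS=4, "
    "CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))",
])

sample_counts = tally_ce(sample)
for (mc, row, channel, kind), n in sample_counts.most_common():
    print(f"mc{mc} row {row} channel {channel}: {n} x {kind}")
```

For the real log, the text would come from reading /var/log/messages instead of the `sample` string.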
I ran "yum install edac-utils -y" and got this output:
[root@s-virt ~]# edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow0: 0 Uncorrected Errors
mc0: csrow0: ch0: 0 Corrected Errors
mc0: csrow0: ch1: 0 Corrected Errors
mc0: csrow0: ch2: 0 Corrected Errors
mc0: csrow0: ch3: 0 Corrected Errors
mc0: csrow1: 0 Uncorrected Errors
mc0: csrow1: ch0: 0 Corrected Errors
mc0: csrow1: ch1: 0 Corrected Errors
mc0: csrow1: ch2: 0 Corrected Errors
mc0: csrow1: ch3: 0 Corrected Errors
mc0: csrow2: 0 Uncorrected Errors
mc0: csrow2: ch0: 0 Corrected Errors
mc0: csrow2: ch1: 80384 Corrected Errors
mc0: csrow2: ch2: 0 Corrected Errors
mc0: csrow2: ch3: 0 Corrected Errors
mc0: csrow3: 0 Uncorrected Errors
mc0: csrow3: ch0: 0 Corrected Errors
mc0: csrow3: ch1: 8 Corrected Errors
mc0: csrow3: ch2: 0 Corrected Errors
mc0: csrow3: ch3: 0 Corrected Errors
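Most of these counters are zero, so it may help to filter the output down to the channels that actually report corrected errors. A minimal sketch, assuming Python 3 and the "mcN: csrowN: chN: N Corrected Errors" layout shown above; the function name `nonzero_ce` is hypothetical.

```python
import re

# Assumption: matches only the per-channel lines of the "edac-util -v" output above.
CE_LINE = re.compile(r"mc(\d+): csrow(\d+): ch(\d+): (\d+) Corrected Errors")

def nonzero_ce(text):
    """Return (mc, csrow, channel, count) for every channel with count > 0."""
    return [
        (int(mc), int(row), int(ch), int(n))
        for mc, row, ch, n in CE_LINE.findall(text)
        if int(n) > 0
    ]

# A few lines copied from the output above.
sample = """mc0: 0 Corrected Errors with no DIMM info
mc0: csrow2: 0 Uncorrected Errors
mc0: csrow2: ch0: 0 Corrected Errors
mc0: csrow2: ch1: 80384 Corrected Errors
mc0: csrow3: ch1: 8 Corrected Errors
"""

hits = nonzero_ce(sample)
for mc, row, ch, n in hits:
    print(f"mc{mc} csrow{row} ch{ch}: {n} corrected errors")
```

On this sample only csrow2/ch1 and csrow3/ch1 are reported, which agrees with the "row 2, channel 1" in the kernel messages.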
The server is a:
[root@s-virt ~]# lshw
s-virt.dom.it
    description: Tower Computer
    product: ProLiant ML370 G5 (433752-421)
    vendor: HP
    serial: GBxxxxxxxM
    width: 64 bits
    capabilities: smbios-2.4 dmi-2.4 vsyscall64 vsyscall32
    configuration: boot=hardware-failure-fw chassis=tower family=ProLiant sku=433752-421 uuid=34333337-3532-4742-3837-35303557534D
with this RAM installed:
[root@s-virt ~]# dmidecode -t memory|grep Size
Size: 1024 MB
Size: 2048 MB
Size: No Module Installed
Size: No Module Installed
Size: 1024 MB
Size: 2048 MB
Size: No Module Installed
Size: No Module Installed
Size: 1024 MB
Size: 4096 MB
Size: No Module Installed
Size: No Module Installed
Size: 1024 MB
Size: 4096 MB
Size: No Module Installed
Size: No Module Installed
I'm not a hardware guru and I don't know how to "decrypt" these log messages and command outputs.
What problem is the log reporting?
What should I do?
Many thanks for your help.
On Tue, 03/05/2016 at 12.15 +0200, Dario Lesca wrote:
After updating from CentOS 6.6 to CentOS 6.7 and rebooting, I get a lot of these errors in /var/log/messages:
May 3 11:27:20 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=6093 CAS=896, CE Err=0x10000 (Correctable Patrol Data ECC))
...
What problem is the log reporting?
What should I do?
Many thanks for your help.
I have found this suggestion:
According to the logs, you are getting CE (Corrected Error) messages on the system, and you can ignore them. Edit grub.conf and add mce=dont_log_ce to the kernel line, which will stop corrected error messages from being written to the log file.
But it is always good to run a memory check on the system.
http://serverfault.com/questions/531110/var-log-messages-showing-lots-of-ce-...
Adding mce=dont_log_ce to grub.conf implies a reboot.
Is it possible to stop the log messages without rebooting?
Thanks
Tuesday, May 3, 2016, 12:15:21 PM, you wrote:
DL> After updating from CentOS 6.6 to CentOS 6.7 and rebooting, I get a
DL> lot of these errors in /var/log/messages:
May 3 11:27:20 s-virt kernel: EDAC MC0: CE row 2, channel 1, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=6093 CAS=896, CE Err=0x10000 (Correctable Patrol Data ECC))
Hi Dario,
I had a similar case in the past.
I had a brand new server that seemed to be running fine. After a kernel update, I got a lot of error messages in /var/log/messages.
That particular kernel update added error messages related to the motherboard chipset. It appears that new boards with a new chipset can run fine with an existing kernel; as soon as kernel updates incorporate special features of these chipsets, you might get such messages.
I returned the board under warranty. The board manufacturer informed me that a memory controller was faulty. The board had the hardware error from the beginning. The new kernel only revealed an already existing problem.
You may want to check first if the hardware problem comes from memory or motherboard.
best regards
---
Michael Schumacher