Vladimir Budnev wrote:
> Hello community.
>
> We are running Centos 4.8 on SuperMicro SYS-6026T-3RF with 2xIntel Xeon
> E5630 and 8xKingston KVR1333D3D4R9S/4G
>
> For some time we have lots of MCE in mcelog and we can't find out the
> reason.

The only thing that shows up there (when it shows up, since sometimes it
doesn't seem to) is a hardware error. You *WILL* be replacing hardware,
sometime soon, like yesterday. "Normal" it is not: *ANYTHING* here is Bad
News.

First, you've got DIMMs failing. The CPU numbers in mcelog are cores, not
physical chips. CPU 53, assuming this system doesn't have 53+ physical CPUs,
means that you have x-core chips, so you need to divide the CPU number by x
to find the physical CPU. If it's, say, a 12-core system with 6 physical
chips, that would make it DIMM 8 associated with that physical CPU.
<snip>
> One more interesting thing is the following output:
> [root@zuno]# cat /var/log/mcelog |grep CPU|sort|awk '{print $2}'|uniq
> 32
> 33
> 34
> 35
> 50
> 51
> 52
> 53
>
> Those numbers are always the same.

Bad news: you have *two* DIMMs failing, one associated with the physical CPU
that has core 53, and another associated with the physical CPU that has
cores 32-35. Talk to your OEM support to help identify which banks need
replacing, and/or find a motherboard diagram.

        mark, who has to deal *again* with one machine with the same
              problem....
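
For anyone who wants to repeat the grouping on their own log, here is a minimal Python sketch of the "divide by x" reasoning above. It is an illustration, not mcelog's own method. The function names and the `logical_per_package` parameter are made up for this example, and it assumes logical CPUs are numbered contiguously per physical chip, which the BIOS and kernel do not guarantee. The `processor` and `physical id` fields in /proc/cpuinfo are the more reliable place to check the real mapping.

```python
from collections import Counter


def cpu_error_counts(log_text):
    """Count mcelog records per CPU number.

    Roughly mirrors the quoted pipeline: look at lines whose first field
    is "CPU" and take the second whitespace-separated field.
    """
    counts = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "CPU" and fields[1].isdecimal():
            counts[int(fields[1])] += 1
    return counts


def group_by_package(cpus, logical_per_package):
    """Group CPU numbers by integer division.

    ASSUMPTION: CPUs are numbered contiguously per physical chip, so
    cpu // logical_per_package is the same for all CPUs on one chip.
    The resulting key is only a grouping label, not a socket number
    printed on the motherboard.
    """
    groups = {}
    for cpu in sorted(cpus):
        groups.setdefault(cpu // logical_per_package, []).append(cpu)
    return groups
```

Feeding it the contents of /var/log/mcelog and, as a guess, 8 logical CPUs per chip splits the eight numbers quoted above into two groups, 32-35 and 50-53, which fits the two-failing-DIMMs reading. Which bank that maps to still has to come from the OEM or a motherboard diagram.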