[CentOS] mce error

Wed Nov 14 02:58:42 UTC 2012
Ted Miller <tedlists at sbcglobal.net>

On 11/13/2012 09:21 AM, Johnny Hughes wrote:
> On 11/13/2012 07:49 AM, Banyan He wrote:
>> Just check the config to build the edac_mce module if you don't build it in.
>>
>> CONFIG_EDAC_MCE=y
>>
>> Make sure you have this in the /boot/config-xxxx.
>
> If he is running a standard CentOS kernel then he should have
> CONFIG_EDAC_MCE=y.
>
>>
>>
>> On 2012-11-13 8:12 PM, Ted Miller wrote:
>>> During booting of Centos6 I see an error message that goes something like:
>>>
>>> Starting mcelog daemon                                     [FAILED]
>>> AMD Processor family 15: Please load edac_mce_amd module.
>>> CPU is unsupported
>>>
>>> The only helpful information I have found is in the "preview" of
>>> https://access.redhat.com/knowledge/solutions/158503.  I don't have a
>>> RedHat account, so don't know if they have a real solution.
>>>
>>> I know that mce has to do with logging certain microprocessor errors.
>>>
>>> 1. How important is this
>>> 2. Is there anything I should do, except wait for a bug fix sometime?
>>>
>>> Ted Miller
>>> Elkhart, IN
>
> What does this command say:
>
> uname -r

The install is 100% stock: from the Minimal Install disk, with the Desktop
groups added afterwards.  It is up to date.

    [tmiller@office04]$ uname -r
    2.6.32-279.14.1.el6.x86_64

Then I tried the command the web page gives (I see my error during bootup):

    [root@office04 Documents]# /etc/init.d/mcelogd start
    [root@office04 Documents]# /etc/init.d/mcelogd status
    Checking for mcelog
    mcelog is stopped

    [tmiller@office04]$ ls /dev/mc*
    /dev/mcelog

So the device does exist.

    [root at office04 Documents]# locate edac_mci_amd

That returned nothing, but I don't know whether it should.
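
For anyone repeating these two checks (kernel config option and module
file), a minimal sketch follows. The helper names config_has and
module_file_exists are made up for illustration and are not part of mcelog
or any CentOS package. The paths in the usage lines assume the usual
/boot/config-<release> and /lib/modules/<release> layout.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# config_has FILE OPTION
# Succeeds if FILE sets OPTION to y (built in) or m (module).
config_has() {
    grep -Eq "^$2=(y|m)\$" "$1"
}

# module_file_exists DIR NAME
# Succeeds if a file named NAME.ko (optionally with a compression
# suffix) exists somewhere under DIR.
module_file_exists() {
    [ -n "$(find "$1" -name "$2.ko*" -print 2>/dev/null)" ]
}
```

Possible usage, assuming the standard paths:

    config_has "/boot/config-$(uname -r)" CONFIG_EDAC_MCE && echo enabled
    module_file_exists "/lib/modules/$(uname -r)" edac_mce_amd && echo found

Unlike locate, the find-based check does not depend on an up-to-date
locate database. An option set to y is built into the kernel image, so no
separate .ko file is expected for it.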

Reading the man page, I noticed "See mcelog --help for a list of valid
CPUs.", so I tried it.  It lists:
    Valid CPUs: generic p6old core2 k8 p4 dunnington xeon74xx xeon7400
    xeon5500 xeon5200 xeon5000 xeon5100 xeon3100 xeon3200 core_i7 core_i5
    core_i3 nehalem westmere xeon71xx xeon7100 tulsa intel xeon75xx
    xeon7500 xeon7200 xeon7100 sandybridge sandybridge-ep
All the CPUs I recognize in there are Intel, though I don't know all the 
nicknames.

    cat /proc/cpuinfo

on my system shows (only the first of two cores copied):

    processor	: 0
    vendor_id	: AuthenticAMD
    cpu family	: 15
    model		: 35
    model name	: Dual Core AMD Opteron(tm) Processor 180
    stepping	: 2
    cpu MHz		: 1000.000
    cache size	: 1024 KB
    physical id	: 0
    siblings	: 2
    core id		: 0
    cpu cores	: 2
    apicid		: 0
    initial apicid	: 0
    fpu		: yes
    fpu_exception	: yes
    cpuid level	: 1
    wp		: yes
    flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow rep_good pni lahf_lm cmp_legacy
    bogomips	: 2009.40
    TLB size	: 1024 4K pages
    clflush size	: 64
    cache_alignment	: 64
    address sizes	: 40 bits physical, 48 bits virtual
    power management: ts fid vid ttp
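
The two fields that matter for the mcelog message (vendor and CPU family)
can be pulled out of that output directly. The following is a minimal
sketch; cpuinfo_field is a made-up helper name, and it assumes the
"key : value" layout shown above.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.

# cpuinfo_field FILE KEY
# Prints the value of the first "KEY : value" line in a cpuinfo-style FILE.
cpuinfo_field() {
    awk -v key="$2" '
        {
            pos = index($0, ":")
            if (pos == 0) next                 # skip lines without a colon
            name = substr($0, 1, pos - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)  # trim padding around the key
            if (name == key) {
                value = substr($0, pos + 1)    # everything after the first colon
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
                exit                           # first match only (first core)
            }
        }' "$1"
}
```

Possible usage:

    cpuinfo_field /proc/cpuinfo vendor_id
    cpuinfo_field /proc/cpuinfo "cpu family"

Given the output pasted above, these would print AuthenticAMD and 15.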

It's not the latest and greatest, but it is old enough that I expected it
to be supported by now.

Any clues in all this?
Ted Miller