[CentOS] How to enable EDAC kernel module for checking ECC memory?

Wed Jun 25 22:08:23 UTC 2014
Lists <lists at benjamindsmith.com>

In order to support ZFS, we upgraded a backup server with a new ECC 
motherboard. We're running CentOS 6 with ZFS on Linux, recently patched. 
Now I want to enable EDAC so we can check for memory errors (and maybe 
PCI errors as well), but so far repeated Google searches haven't 
turned up exactly what I need to do to enable EDAC.

One howto covered PCI and EDAC, but "modprobe edac_mc" didn't work. 
Some information is below. How do I get EDAC up and running? Many 
howtos cover how to use edac-ctl and edac-util, but none seem to cover 
how to determine which module to load into the kernel.
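
A first step toward that question could be to list which EDAC driver 
modules the running kernel actually ships. This is only a minimal sketch: 
it assumes modules are installed under /lib/modules/<kernel release>, the 
function name list_edac_modules is made up for illustration, and it does 
not tell you which driver matches a given chipset.

```shell
# list_edac_modules [MODULE_DIR]
# Print the names of all kernel modules with "edac" in their file name.
# MODULE_DIR defaults to the module tree of the running kernel
# (assumed location: /lib/modules/$(uname -r)).
list_edac_modules() {
    moddir="${1:-/lib/modules/$(uname -r)}"
    # Match plain and compressed modules (.ko, .ko.gz, .ko.xz, ...),
    # then strip the directory part and the .ko* suffix.
    find "$moddir" -type f -name '*edac*.ko*' 2>/dev/null |
        sed -e 's|.*/||' -e 's|\.ko.*$||' |
        sort
}
```

Calling list_edac_modules with no argument inspects the running kernel. 
Any name it prints can be tried with modprobe, and an empty result 
suggests the kernel package ships no EDAC drivers at all.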

[root@hume ~]# modprobe edac_mc
FATAL: Module edac_mc not found.
[root@hume ~]# lsmod | grep edac
[root@hume ~]# cat /proc/version
Linux version 2.6.32-431.11.2.el6.x86_64 (mockbuild@c6b8.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Tue Mar 25 19:59:55 UTC 2014
[root@hume ~]# modprobe edac_mce
FATAL: Module edac_mce not found.
[root@hume ~]# edac-ctl --mainboard
edac-ctl: mainboard: Supermicro X9SCL/X9SCM
[root@hume ~]# edac-ctl --status
edac-ctl: drivers not loaded.
[root@hume ~]# lsmod
Module                  Size  Used by
ext3                  240013  1
jbd                    80858  1 ext3
nfsd                  309196  13
lockd                  73662  1 nfsd
nfs_acl                 2647  1 nfsd
auth_rpcgss            44949  1 nfsd
sunrpc                262864  20 nfsd,lockd,nfs_acl,auth_rpcgss
exportfs                4236  1 nfsd
bnx2fc                 90539  0
fcoe                   23298  0
libfcoe                56791  2 bnx2fc,fcoe
8021q                  25349  0
garp                    7152  1 8021q
stp                     2218  1 garp
libfc                 108670  3 bnx2fc,fcoe,libfcoe
llc                     5546  2 garp,stp
scsi_transport_fc      55299  3 bnx2fc,fcoe,libfc
scsi_tgt               12077  1 scsi_transport_fc
ipt_REJECT              2351  2
nf_conntrack_ipv4       9506  15
nf_defrag_ipv4          1483  1 nf_conntrack_ipv4
iptable_filter          2793  1
ip_tables              17831  1 iptable_filter
ip6t_REJECT             4628  2
nf_conntrack_ipv6       8337  2
nf_defrag_ipv6         11156  1 nf_conntrack_ipv6
xt_state                1492  17
nf_conntrack           79758  3 nf_conntrack_ipv4,nf_conntrack_ipv6,xt_state
ip6table_filter         2889  1
ip6_tables             18732  1 ip6table_filter
iTCO_wdt                7115  0
iTCO_vendor_support     3056  1 iTCO_wdt
zfs                  1152935  53
zcommon                44698  1 zfs
znvpair                80460  2 zfs,zcommon
zavl                    6925  1 zfs
zunicode              323159  1 zfs
spl                   260832  5 zfs,zcommon,znvpair,zavl,zunicode
zlib_deflate           21629  1 spl
i2c_i801               11359  0
i2c_core               31084  1 i2c_i801
ses                     6475  0
enclosure               8438  1 ses
sg                     29350  0
lpc_ich                12803  0
mfd_core                1895  1 lpc_ich
shpchp                 32778  0
ext4                  374902  3
jbd2                   93427  1 ext4
mbcache                 8193  2 ext3,ext4
raid1                  32045  2
usb_storage            49068  5
sd_mod                 39069  27
crc_t10dif              1541  1 sd_mod
ata_generic             3837  0
pata_acpi               3701  0
pata_jmicron            2813  2
video                  20674  0
output                  2409  1 video
e1000e                267701  0
ptp                     9614  1 e1000e
pps_core               11458  1 ptp
ahci                   42247  8
xhci_hcd              148886  0
dm_mirror              14384  0
dm_region_hash         12085  1 dm_mirror
dm_log                  9930  2 dm_mirror,dm_region_hash
dm_mod                 84209  2 dm_mirror,dm_log
be2iscsi               99578  0
bnx2i                  48096  0
cnic                   57079  2 bnx2fc,bnx2i
uio                    10462  1 cnic
ipv6                  317829  56 ip6t_REJECT,nf_conntrack_ipv6,nf_defrag_ipv6,cnic
cxgb4i                 28361  0
cxgb4                 104882  1 cxgb4i
cxgb3i                 24491  0
libcxgbi               52202  2 cxgb4i,cxgb3i
cxgb3                 152922  1 cxgb3i
mdio                    4769  1 cxgb3
libiscsi_tcp           17020  3 cxgb4i,cxgb3i,libcxgbi
qla4xxx               257114  0
iscsi_boot_sysfs        9458  2 be2iscsi,qla4xxx
libiscsi               49836  7 be2iscsi,bnx2i,cxgb4i,cxgb3i,libcxgbi,libiscsi_tcp,qla4xxx
scsi_transport_iscsi    84241  5 be2iscsi,bnx2i,libcxgbi,qla4xxx,libiscsi


[root@hume ~]# rpm -qi edac-utils
Name        : edac-utils                   Relocations: (not relocatable)
Version     : 0.9                               Vendor: CentOS
Release     : 14.el6                        Build Date: Wed 20 Jul 2011 11:13:34 AM UTC
Install Date: Wed 25 Jun 2014 09:27:40 PM UTC      Build Host: c6b6.bsys.dev.centos.org
Group       : System Environment/Base       Source RPM: edac-utils-0.9-14.el6.src.rpm
Size        : 78637                            License: GPLv2+
Signature   : RSA/SHA1, Mon 26 Sep 2011 04:17:58 AM UTC, Key ID 0946fca2c105b9de
Packager    : CentOS BuildSystem <http://bugs.centos.org>
URL         : http://sourceforge.net/projects/edac-utils/
Summary     : Userspace helper for kernel EDAC drivers
Description :
EDAC is the current set of drivers in the Linux kernel that handle
detection of ECC errors from memory controllers for most chipsets
on i386 and x86_64 architectures. This userspace component consists
of an init script which makes sure EDAC drivers and DIMM labels
are loaded at system startup, as well as a library and utility
for reporting current error counts from the EDAC sysfs files.
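
The package description above mentions error counts exposed through the 
EDAC sysfs files. Once a driver is loaded, those counts could also be 
read directly. This is a minimal sketch, assuming the usual layout 
/sys/devices/system/edac/mc/mcN/ with ce_count and ue_count files; 
edac_counts is a made-up helper name, not part of edac-utils.

```shell
# edac_counts [SYSFS_ROOT]
# Print corrected (ce) and uncorrected (ue) error counts for every
# memory controller directory found under SYSFS_ROOT.
# SYSFS_ROOT defaults to /sys/devices/system/edac/mc (assumed layout).
edac_counts() {
    root="${1:-/sys/devices/system/edac/mc}"
    found=0
    for mc in "$root"/mc*; do
        # An unmatched glob stays literal, so skip anything that is
        # not an existing directory.
        [ -d "$mc" ] || continue
        found=1
        ce=$(cat "$mc/ce_count" 2>/dev/null || echo '?')
        ue=$(cat "$mc/ue_count" 2>/dev/null || echo '?')
        echo "${mc##*/}: corrected=$ce uncorrected=$ue"
    done
    if [ "$found" -ne 1 ]; then
        # No mcN directories usually means no EDAC driver is loaded.
        echo "no EDAC memory controllers found under $root" >&2
        return 1
    fi
}
```

With no EDAC driver loaded, as on the system shown here, the function 
reports that no controllers were found and returns a non-zero status.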