[CentOS] Intel SE7210TP1-E giving memory errors
John R Pierce
pierce at hogranch.com
Wed Dec 7 09:07:55 UTC 2011
On 12/07/11 12:55 AM, John Hodrien wrote:
> In my limited experience, if you can disable ECC in your BIOS, memtest is
> just as good at spotting errors on ECC as non-ECC. With ECC enabled,
> you'll need seriously messed up ECC before it'll be detected.
Except that with ECC disabled, the extra 8 ECC bits per 64-bit memory word
aren't touched at all, so memtest never exercises them.
I'd leave ECC on and skip running memtest entirely: just run real OS
workloads and let ECC do the memory test on the fly, as it's meant to.
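For anyone who wants to watch ECC doing that job under a real workload, here
is a minimal sketch that reads the corrected/uncorrected error counters the
Linux EDAC subsystem exposes in sysfs. It assumes an EDAC driver for your
chipset is loaded; the function name read_edac_counts and the root parameter
are my own, for illustration only.

```python
from pathlib import Path

def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs files.

    Assumes the usual EDAC layout: one mcN directory per memory
    controller, each with ce_count and ue_count files.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers without readable counters
        counts[mc.name] = (ce, ue)
    return counts

if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected count that keeps climbing under load points at a DIMM worth
swapping; any uncorrected count at all is more serious.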
Does Linux have an ECC scrubber process? 'Real' Unix servers (Solaris,
AIX, etc.) generally have a background process, sometimes part of the
idle process, that does a read/write of every memory location when the
machine is otherwise idle. This catches and fixes soft ECC errors in
otherwise idle memory, and the corrections get logged. Solaris (on Sun
SPARC hardware at least) keeps track of which locations have had bad
memory, and will stop using a memory page entirely (with a logged alert)
if there are too many soft ECC errors in the same area.
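To make the page-retirement idea concrete, here is a rough sketch of the
bookkeeping involved: count corrected errors per page and stop using a page
once it crosses a threshold. This is not Solaris's actual implementation; the
class name, the 4096-byte page size and the threshold of 3 are all assumptions
picked for illustration.

```python
from collections import Counter

PAGE_SIZE = 4096       # assumed page size in bytes
RETIRE_THRESHOLD = 3   # assumed: retire a page after this many soft errors

class SoftErrorTracker:
    """Hypothetical per-page soft ECC error bookkeeping."""

    def __init__(self, page_size=PAGE_SIZE, threshold=RETIRE_THRESHOLD):
        self.page_size = page_size
        self.threshold = threshold
        self.errors = Counter()   # page number -> corrected error count
        self.retired = set()      # pages no longer in use

    def record(self, address):
        """Log a corrected error; return True if it retires the page."""
        page = address // self.page_size
        if page in self.retired:
            return False          # already out of service
        self.errors[page] += 1
        if self.errors[page] >= self.threshold:
            self.retired.add(page)
            print(f"ALERT: retiring page {page:#x} after "
                  f"{self.errors[page]} soft errors")
            return True
        return False
```

Three errors at, say, 0x1000, 0x1008 and 0x1ff0 all land in page 1, so the
third call to record() retires that page and logs the alert.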
--
john r pierce N 37, W 122
santa cruz ca mid-left coast