[CentOS] data corruption on AMD AM2 systems with 4GB of RAM or more

Thu May 31 00:07:56 UTC 2007
Dan Halbert <halbert at bbn.com>

Bart Schaefer wrote:
> Could this be the same bug as
> http://bugs.centos.org/view.php?id=1774
> ??  We are experiencing the exact symptoms described in centos/1774 on
> a dual-CPU Opteron system with 16GB RAM.
I don't believe this is the same problem. If I'm reading it correctly,
the symptoms in 1774 are kernel panics with CPU ECC errors. The 7768
kernel bug shows up as corrupted 4k blocks in the pagecache, not as
kernel panics.
Are you seeing this on several machines? You could try booting with
iommu=soft, which is the 7768 workaround, but I think that's a long
shot. You might also try "noapic", but hardware ECC errors would seem
to point to a timing problem or bad hardware.
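
If you do try either option, it is worth confirming that the running
kernel actually booted with it. Below is a minimal sketch in Python,
assuming a Linux system that exposes the boot command line in
/proc/cmdline. The helper names (boot_params, workaround_status,
read_cmdline) are made up for illustration and are not part of any
CentOS tool.

```python
def boot_params(cmdline):
    """Split a kernel command line into {name: value}.

    Bare flags such as "noapic" map to None.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def workaround_status(cmdline):
    """Report whether iommu=soft and noapic appear on the command line."""
    params = boot_params(cmdline)
    # iommu= may carry several comma-separated options, so check each one.
    iommu_opts = (params.get("iommu") or "").split(",")
    return {
        "iommu=soft": "soft" in iommu_opts,
        "noapic": "noapic" in params,
    }


def read_cmdline(path="/proc/cmdline"):
    """Return the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the affected machine, print(workaround_status(read_cmdline())) after
a reboot shows which of the two options took effect. This only confirms
that the parameters were passed to the kernel; it says nothing about
whether the corruption or the ECC errors are gone.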