Bart Schaefer wrote:
Could this be the same bug as http://bugs.centos.org/view.php?id=1774 ?
We are experiencing the exact symptoms described in centos/1774 on a dual-CPU Opteron system with 16GB RAM.
I don't believe this is the same problem. The symptoms in 1774 are kernel panics with CPU ECC errors, if I'm reading it correctly. The 7768 kernel bug shows up as corrupted 4k blocks in the pagecache, not as kernel panics.

Does this happen on several machines, or only on one? You could try booting with iommu=soft, which is the 7768 workaround, but I think that's a long shot. You might also try "noapic", but hardware ECC errors would seem to point to a timing problem or bad hardware.
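
If you do try those boot options, it is worth confirming after the reboot that the kernel actually picked them up. The following is only a rough sketch: it reads /proc/cmdline and reports whether iommu=soft and noapic are present. The function names parse_cmdline and workaround_status are made up for this example, and it assumes the options appear on the command line exactly as written above (it also accepts a comma-separated iommu= value such as iommu=soft,noforce).

```python
from pathlib import Path


def parse_cmdline(text):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def workaround_status(params):
    """Report whether the two boot options discussed above are in effect."""
    # iommu= may carry several comma-separated settings; look for "soft" among them.
    iommu_settings = (params.get("iommu") or "").split(",")
    return {
        "iommu=soft": "soft" in iommu_settings,
        "noapic": "noapic" in params,
    }


if __name__ == "__main__":
    cmdline_path = Path("/proc/cmdline")  # present on Linux systems
    if cmdline_path.exists():
        status = workaround_status(parse_cmdline(cmdline_path.read_text()))
        for option, active in status.items():
            print(f"{option}: {'active' if active else 'not set'}")
```

Run it on each affected machine after rebooting. If an option shows as "not set", the boot loader entry was probably not the one that was edited.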