On 2/14/2011 9:53 AM, Rob Kampen wrote:
This system was initially commissioned, after burn-in, in late 2004, on an Intel motherboard. It started with RH9, then went to FC3, then CentOS 5. As mentioned, the ECC memory has warned me when things were not well and allowed me to take remedial action before anything impacted my data. It still does great work six years later. For some reason, each time I have shifted it, we have started getting these errors. It may be accumulated dust and dirt, so I always clean everything while it is down. Re-seating the RAM after cleaning the contacts and blowing out the dust has always worked. So for me, a server-grade motherboard with ECC RAM is a great investment and worth the slight extra cost, not to mention that CentOS seems to have the drivers and modules in place for these motherboards.
I've seen that too, where moving a server would unseat RAM just enough to cause occasional crashes - and the crashes are better than undetected data/file errors. We mostly use IBM servers that have some diagnostic lights in the front to make the problem obvious.