On Wed, Oct 19, 2011 at 2:33 PM, Lamar Owen <lowen at pari.edu> wrote:
> On Tuesday, October 18, 2011 01:07:02 PM Les Mikesell wrote:
>> I don't think anything is immune to failure. Another fun case is a
>> randomly-bad memory bit causing different things to be written to
>> software raid mirrors. I had one that took 3+ days of running
>> memtest86 to catch.
>
> ECC RAM?

The server said it was one-bit-correcting or something like that. I
thought it was supposed to stop if it had errors it couldn't correct.
I swapped the whole set out at once without digging much more into the
details.

--
Les Mikesell
lesmikesell at gmail.com