On Wed, Jan 7, 2015 at 9:52 AM, Gordon Messmer gordon.messmer@gmail.com wrote:
Every regular file's directory entry on your system is a hard link. There's nothing particular about links (files) that makes a filesystem fragile.
Agreed, although when there are millions of them, the fsck that fixes things is somewhat slow.
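For anyone who wants to see the hard-link point for themselves, here is a minimal Python sketch. It assumes a POSIX filesystem that supports hard links; the file names and the link_counts helper are made up for illustration. It creates a file, gives it a second name with os.link, and shows that both names refer to the same inode and that the data survives removal of the original name.

```python
import os
import tempfile


def link_counts():
    """Create a file, add a second hard link, and report what stat() sees.

    Returns (links_before, links_after, same_inode, links_remaining, content).
    """
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "original")      # hypothetical file name
        second = os.path.join(d, "second_name")  # hypothetical file name

        with open(first, "w") as f:
            f.write("data\n")

        # A freshly created regular file already has one hard link:
        # the directory entry that names it.
        links_before = os.stat(first).st_nlink

        # Add a second directory entry pointing at the same inode.
        os.link(first, second)
        links_after = os.stat(first).st_nlink
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino

        # Removing the first name only drops one link; the data remains
        # reachable through the second name.
        os.unlink(first)
        links_remaining = os.stat(second).st_nlink
        with open(second) as f:
            content = f.read()

        return links_before, links_after, same_inode, links_remaining, content


if __name__ == "__main__":
    print(link_counts())
```

Neither name is "the real file" and the other "the link"; both are ordinary directory entries, which is why a tree with millions of extra links is a lot of metadata for fsck to walk but not structurally different from one with millions of files.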
It is mostly on aging hardware, so it is possible that there are underlying controller issues. I also see some rare cases on similar machines where a filesystem will go read-only with some SCSI errors logged, but I haven't looked for that yet in this case.
It's probably a similar cause in all cases. I don't know how many times I've seen you on this list defending running old hardware / obsolete hardware. Corruption and failure are more or less what I'd expect if your hardware is junk.
Not junk - these are mostly IBM 3550/3650 boxes - pretty much top of the line in their day (before the M2/3/4 versions). They have Adaptec RAID controllers and SAS drives, mostly configured as RAID1 mirrors. I realize that hardware isn't perfect, and this is not happening on a large percentage of them. But I don't see anything that looks like SCSI errors in this log, and I'm surprised that after running apparently error-free there would be problems detected after a software reboot.
I think the newer M2 and later models went to a different RAID controller, though. Maybe there was a reason.