[CentOS] reboot - is there a timeout on filesystem flush?

Wed Jan 7 16:33:28 UTC 2015

On Wed, Jan 7, 2015 at 9:52 AM, Gordon Messmer <gordon.messmer at gmail.com> wrote:
>
> Every regular file's directory entry on your system is a hard link. There's
> nothing particular about links (files) that make a filesystem fragile.

Agreed, although when there are millions of them, an fsck that has to
repair them is somewhat slow.
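
To illustrate the point about directory entries, here is a minimal
sketch in Python. It is not from the original thread; it assumes a POSIX
filesystem that supports hard links, and the function name and the file
names "a" and "b" are made up for the example. It creates a file, gives
it a second name with os.link(), and reads the link count from stat.

```python
import os
import tempfile


def link_counts():
    """Create a file, add a second name for it, and report its link counts.

    Returns (links_before, links_after, same_inode).
    Assumes the temporary directory lives on a filesystem with hard-link support.
    """
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "a")    # hypothetical file name
        second = os.path.join(d, "b")   # hypothetical second name

        with open(first, "w") as f:
            f.write("data")

        # A freshly created regular file already has one (hard) link:
        # the directory entry that names it.
        before = os.stat(first).st_nlink

        # Add a second directory entry pointing at the same inode.
        os.link(first, second)
        after = os.stat(first).st_nlink

        # Both names resolve to the same inode.
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        return before, after, same_inode


if __name__ == "__main__":
    print(link_counts())
```

The "ordinary" name and the added name are the same kind of object, which
is why a large number of links mainly costs time in fsck rather than
making the filesystem more fragile.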

>> It is mostly on aging hardware, so it
>> is possible that there are underlying controller issues.  I also see
>> some rare cases on similar machines where a filesystem will go
>> read-only with some scsi errors logged, but didn't look for that yet
>> in this case.
>
>
> It's probably a similar cause in all cases.  I don't know how many times
> I've seen you on this list defending running old hardware / obsolete
> hardware.  Corruption and failure are more or less what I'd expect if your
> hardware is junk.

Not junk - these are mostly IBM 3550/3650 boxes - pretty much top of
the line in their day (before the M2/3/4 versions).  They have
Adaptec RAID controllers and SAS drives, mostly configured as RAID1
mirrors.  I realize that hardware isn't perfect, and this is not
happening on a large percentage of them.  But I don't see anything
that looks like SCSI errors in this log, and I'm surprised that after
running apparently error-free there would be problems detected after a
software reboot.

I think the newer M2 and later models went to a different RAID
controller, though.   Maybe there was a reason.

-- 
   Les Mikesell
      lesmikesell at gmail.com