[CentOS] Hard I/O lockup with EL6

Mon Sep 26 19:36:19 UTC 2011
m.roth at 5-cent.us <m.roth at 5-cent.us>

Benjamin Smith wrote:
> I'm trying to figure out why 2 machines have a "hard I/O lock" on the HDD
> when running EL6.
>
> I have 4 identical machines; all were stable with EL5. 2 work great with
> EL6, 2 do not. I've checked motherboard BIOS versions and settings and SAS
> controller BIOS versions and settings; they are the same between the
> working and non-working systems.
>
> When booting a non-working system, it boots straight up to the boot prompt
> (runlevel 3) without issue, and everything works fine. When the machine
> sits idle for a period of time (ranging from 15 minutes or so and up) the
> HDD becomes unreadable/unwritable and the system is useless for any
> purpose and must be hard restarted with a full power cycle - it won't
> even shut down.
<snip>
Not quite grasping at straws here, but a) have you checked
/var/log/messages for memory or drive errors? Maybe run memtest86? b) Have
you diffed dmesg between the working and dying machines?
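
For (b), a rough sketch of one way to do the comparison, in case plain diff
is too noisy. It assumes you have saved the output of dmesg from one working
and one dying machine into two text files (the file names are whatever you
pass on the command line), and it strips the kernel timestamps, if present,
so that only the message text is compared. The function names are just
illustrative.

```python
import difflib
import re
import sys

# Matches a leading "[   12.345678] " kernel timestamp, if one is present.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(lines):
    """Strip kernel timestamps and trailing whitespace from each line."""
    return [TIMESTAMP.sub("", line).rstrip() for line in lines]


def diff_dmesg(good_lines, bad_lines):
    """Return a unified diff (list of lines) between two dmesg captures."""
    return list(difflib.unified_diff(
        normalize(good_lines), normalize(bad_lines),
        fromfile="working", tofile="dying", lineterm=""))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Assumed usage: python dmesg_diff.py good.txt bad.txt
    with open(sys.argv[1]) as good, open(sys.argv[2]) as bad:
        print("\n".join(diff_dmesg(good.readlines(), bad.readlines())))
```

Lines prefixed with "+" appear only on the dying machine and lines prefixed
with "-" only on the working one; device probe order can differ between
boots, so expect some harmless noise.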

One more thing: should we assume you were working from the console, not
over the network, when they die? I ask because I note that you're using the
e1000e driver, which was just the subject of a thread here.

         mark