Benjamin Smith wrote:
I'm trying to figure out why 2 machines have a "hard I/O lock" on the HDD when running EL6.
I have 4 identical machines; all were stable with EL5. 2 work great with EL6, 2 do not. I've checked motherboard BIOS versions and settings and SAS controller BIOS versions and settings; they are the same between the working and non-working systems.
When booting a non-working system, it boots straight up to the boot prompt (runlevel 3) without issue, and everything works fine. When the machine sits idle for a period of time (ranging from 15 minutes or so and up) the HDD becomes unreadable/unwritable and the system is useless for any purpose and must be hard restarted with a full power cycle - it won't even shut down.
<snip>
Not quite grasping at straws here, but a) have you checked /var/log/messages for memory or drive errors? Maybe run memtest86? b) Have you diffed dmesg between the working and dying machines?
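For the dmesg comparison in (b), a plain diff tends to be noisy because the kernel timestamps differ on every line. Here is a minimal sketch in Python, assuming each machine's dmesg output has been saved to a text file and that lines carry the usual `[   12.345678]` timestamp prefix. The function names and file names are hypothetical, chosen only for illustration.

```python
import difflib
import re

# Assumption: timestamps look like "[   12.345678] " at the start of a line.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(dmesg_text):
    """Return the lines of a dmesg dump with leading timestamps removed."""
    return [_TIMESTAMP.sub("", line) for line in dmesg_text.splitlines()]


def diff_dmesg(good_text, bad_text):
    """Unified diff of two dmesg dumps, ignoring timestamp differences."""
    return list(
        difflib.unified_diff(
            strip_timestamps(good_text),
            strip_timestamps(bad_text),
            fromfile="working",
            tofile="dying",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical file names: run `dmesg > working.txt` on a good machine
    # and `dmesg > dying.txt` on a bad one shortly after boot.
    with open("working.txt") as good, open("dying.txt") as bad:
        print("\n".join(diff_dmesg(good.read(), bad.read())))
```

Lines will still differ for things like MAC addresses and serial numbers, so expect some harmless noise; what matters is any driver, firmware, or error message that shows up on only one side.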
One more thing: should we assume that, when they die, you were trying to do things from the console? I ask because I note that you're using the e1000e driver, which was just the subject of a thread here.
mark