[CentOS] Hard I/O lockup with EL6

Mon Sep 26 21:00:52 UTC 2011
Brian McKerr <bmckerr at gmail.com>

Have you checked the cables you are using?


On Tue, Sep 27, 2011 at 6:09 AM, Benjamin Smith <lists at benjamindsmith.com> wrote:

> On Monday, September 26, 2011 12:36:19 PM m.roth at 5-cent.us wrote:
> > a) have you checked /var/log/messages for memory or drive errors?
>
> Looked through the logs, there's *nothing* I can find that's out of sorts.
> When the IO problem happens, nothing can be written.
>
> > Maybe memtest86?
>
> I replaced all the RAM from working/non-working machines. In several cases
> where replacing RAM resolved the issue, memtest didn't indicate any problems,
> so I'm not inclined to trust it.
>
> > b) diffed dmesg between working and dying machines?
>
> Other than the IRQ difference noted earlier, a visual scan revealed no
> differences involving mpt2.
>
> >
> > One more thing: should we assume you were trying to do things, when they
> > die, from the console? I ask because I note that you're using the e1000e
> > driver, which was just the subject of a thread here.
>
> I'm familiar with the stale EL6 e1000e driver. I've been using the one from
> elrepo, installed via yum. I download the RPM manually so that ethernet works
> before doing a yum -y update. I've been assuming this was unrelated.
>
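
For anyone following the thread, the dmesg comparison discussed above can be
done mechanically rather than by eye. The following is only a rough sketch,
not something posted by the participants: the function names strip_ts and
diff_dmesg and the file names are hypothetical, and it assumes the output of
dmesg was saved to a text file on each machine beforehand. If the kernel
prints timestamps, they are stripped first so that identical messages logged
at different times do not show up as differences.

```shell
# Remove a leading "[   12.345678] " kernel timestamp, if present.
# Lines without a timestamp pass through unchanged.
strip_ts() {
    sed 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*\] *//'
}

# Compare two saved dmesg dumps, ignoring timestamps.
# $1 = dump from a working machine, $2 = dump from a failing machine.
# Returns the exit status of diff (0 = identical, 1 = differences found).
diff_dmesg() {
    _good=$(mktemp) || return 2
    _bad=$(mktemp) || { rm -f "$_good"; return 2; }
    strip_ts < "$1" > "$_good"
    strip_ts < "$2" > "$_bad"
    diff -u "$_good" "$_bad"
    _status=$?
    rm -f "$_good" "$_bad"
    return $_status
}
```

Usage, with hypothetical file names: run "dmesg > dmesg-good.txt" on a working
machine and "dmesg > dmesg-bad.txt" on a failing one, copy both files to the
same host, then run "diff_dmesg dmesg-good.txt dmesg-bad.txt". Piping the
result through "grep -i mpt2" narrows it to the controller lines.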