[CentOS-devel] disk i/o stalls with mptsas since upgrade to centos 5.4

Lennert Buytenhek

buytenh at wantstofly.org
Thu May 27 17:41:02 UTC 2010

On Tue, Mar 16, 2010 at 08:43:26AM +0100, Lennert Buytenhek wrote:

> On two different machines, I've been experiencing disk I/O stalls after
> upgrading to the CentOS 5.4 kernel.  Both machines have an LSI 1068E
> MPT SAS (mptsas) controller connected to a Chenbro CK13601 36-port SAS
> expander, with one machine having 16 1T WD disks hooked up to it, and
> the other having a mix of about 20 WD/Seagate/Samsung/Hitachi 1T and 2T
> disks.
> When there's a disk I/O stall, all reads and writes to any disk behind
> the SAS controller/expander just hang for a while (typically for almost
> exactly eight seconds), so not just the I/O to one particular disk or a
> subset of the disks.  The disks on other (on-board SATA) controllers
> still pass I/O requests when the SAS I/O stalls.

FWIW, on the first machine mentioned above, I upgraded the system BIOS,
the mptsas controller's option ROM, and the kernel (to the CentOS 5.5
kernel) all in one go (in an attempt to minimise downtime), and the
problem has so far (after ~1 hour of I/O) not resurfaced.
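
A minimal sketch of one way to watch for stalls like the ones described
above, rather than waiting to notice them: time a small fsync'ed write at
regular intervals and report any sample slower than a threshold.  This is
not from the original report; the function names, the probe file path and
the 1-second threshold are assumptions chosen for illustration, and the
probe file has to live on a filesystem behind the SAS controller.

```python
import os
import time


def timed_write(path, payload=b"x" * 4096):
    """Write payload to path, fsync it, and return the elapsed seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # force the data to the device, not just the page cache
    finally:
        os.close(fd)
    return time.monotonic() - start


def find_stalls(latencies, threshold=1.0):
    """Return (index, seconds) for every sample slower than threshold."""
    return [(i, s) for i, s in enumerate(latencies) if s > threshold]


def probe(path, samples=10, interval=0.5, threshold=1.0):
    """Take `samples` timed writes `interval` seconds apart; return the stalls."""
    latencies = []
    for _ in range(samples):
        latencies.append(timed_write(path))
        time.sleep(interval)
    return find_stalls(latencies, threshold)
```

Running something like probe("/mnt/sas/stall-probe.tmp", samples=7200)
(hypothetical mount point) would cover roughly an hour of I/O; an empty
result means no write took longer than the threshold.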

Since the first machine has a Supermicro i7 board and the second machine
mentioned above has a totally different board, I suspect that the system
BIOS upgrade will not have made a difference.  I'll try to upgrade the
second machine to the CentOS 5.5 kernel soon and see if that by
itself makes the problem go away -- if not, I'll try upgrading the
option ROM on that machine's mptsas controller as well.

(I tried upgrading the SAS controller's firmware as well, but the LSI
mpt tool refuses to do that: it complains that the Product ID on
the controller doesn't match "SAS3442E", which is apparently what it
expected to see.  The controller is a Supermicro AOC-USAS-L8i, and the
firmware update files came straight from Supermicro's FTP site.)

