[CentOS-devel] disk i/o stalls with mptsas since upgrade to centos 5.4

On Tue, Mar 16, 2010 at 10:12:52AM -0400, Ross Walker wrote:

> > On two different machines, I've been experiencing disk I/O stalls
> > after upgrading to the CentOS 5.4 kernel.  Both machines have an
> > LSI 1068E MPT SAS (mptsas) controller connected to a Chenbro CK13601
> > 36-port SAS expander, with one machine having 16 1T WD disks hooked
> > up to it, and the other having a mix of about 20
> > WD/Seagate/Samsung/Hitachi 1T and 2T disks.
> >
> > When there's a disk I/O stall, all reads and writes to any disk
> > behind the SAS controller/expander just hang for a while (typically
> > for almost exactly eight seconds), so not just the I/O to one
> > particular disk or a subset of the disks.  The disks on other
> > (on-board SATA) controllers still pass I/O requests when the SAS
> > I/O stalls.
> >
> > I hacked up the attached (dirty) Perl script to demonstrate this
> > effect -- it will read /proc/diskstats in a tight loop, and keep
> > track of which request entered the request queue when, and when it
> > completed, and it will WTF if a request took more than a second.
> > (The same thing can probably be done with blktrace, but I was lazy.)
> > New requests get submitted, but the pending ones fail to complete
> > for a while, and then they all complete at once.
> >
> > This happens on kernel-2.6.18-164.11.1.el5, while reverting to the
> > latest CentOS 5.3 kernel (kernel-2.6.18-128.7.1.el5) makes the issue
> > go away again, i.e. no more stalls.
> >
> > It doesn't seem to matter whether the I/O load is high or not -- the
> > stalls happen even under almost no load at all.
> >
> > Before I dig into this further, has anyone experienced anything
> > similar?  A quick Google search didn't come up with much.
> 
> I would use iostat -x and see if there is a disk or group of disks
> that show abnormal service times and/or utilization.

I/O to all 16 disks stalls simultaneously, for 8 seconds at a time,
and 'iostat -k 1' shows zero kB/s read and written to each of the
disks (sdb - sdq) for the entire interval.
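
For anyone who wants to check for the same symptom without the attached
script, here is a rough sketch of the /proc/diskstats polling idea in
Python.  It is not the Perl script and it is cruder: instead of tracking
individual requests, it flags a device whose completed-I/O counter has
not moved for longer than a threshold while requests are still in
flight.  The names (parse_diskstats, StallDetector, watch) and the
one-second threshold are my own choices for illustration, and it assumes
a Python 3 interpreter is available.

```python
import time


def parse_diskstats(text):
    """Map device name -> (I/Os completed, I/Os in flight).

    Only whole-disk style lines with the full 11 statistics fields are
    used; shorter lines (e.g. partitions on 2.6.18) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue
        # f[3] = reads completed, f[7] = writes completed,
        # f[11] = I/Os currently in progress
        stats[f[2]] = (int(f[3]) + int(f[7]), int(f[11]))
    return stats


class StallDetector:
    """Flag devices with pending I/O but no completions for a while."""

    def __init__(self, threshold=1.0):
        self.threshold = threshold  # seconds; illustrative default
        self.last = {}              # device -> (completed, time of last progress)

    def update(self, stats, now):
        """Feed one sample; return {device: seconds without progress}."""
        stalled = {}
        for dev, (completed, in_flight) in stats.items():
            prev = self.last.get(dev)
            if prev is None or completed != prev[0] or in_flight == 0:
                # First sample, progress was made, or nothing is pending.
                self.last[dev] = (completed, now)
                continue
            waited = now - prev[1]
            if waited > self.threshold:
                stalled[dev] = waited
        return stalled


def watch(interval=0.1):
    """Poll /proc/diskstats forever and print devices that look stalled."""
    detector = StallDetector()
    while True:
        with open("/proc/diskstats") as fh:
            stats = parse_diskstats(fh.read())
        for dev, waited in sorted(detector.update(stats, time.monotonic()).items()):
            print("%s: no completions for %.1fs with I/O in flight" % (dev, waited))
        time.sleep(interval)
```

Calling watch() on an affected box should print every disk behind the
SAS expander at roughly the same moment if the stall is controller-wide
rather than per-disk.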

> Are there any errors in the logs?

Nope.

> How are the disks configured? Software raid?

Yes, two 8-disk RAID6 sets -- but that doesn't seem relevant.

> Is the adapter's firmware at the latest revision?

Not sure.  I tried upgrading it but the vendor's firmware updater
won't let me (see other email for details).

> Was .128 kernel running stock drivers?

Yes.

> Is .164 kernel running stock drivers?

Yes.

> (maybe weak-updates from .128 kernel?)

Nope.

> What IO scheduler is this? Default CFQ?

Yes.
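
In case it helps anyone comparing setups: the active scheduler for a
disk is the bracketed entry in /sys/block/<dev>/queue/scheduler.  A
small sketch for reading it follows; the function names are made up for
illustration, and it assumes Python 3.

```python
def active_scheduler(text):
    """Return the bracketed (active) scheduler name, or None if absent."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_scheduler(dev):
    """Read the active I/O scheduler for a block device such as 'sdb'."""
    with open("/sys/block/%s/queue/scheduler" % dev) as fh:
        return active_scheduler(fh.read())
```

For example, read_scheduler("sdb") reads the sysfs file for sdb and
returns whichever name is in brackets there.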