Hi! On two different machines, I've been experiencing disk I/O stalls after upgrading to the CentOS 5.4 kernel. Both machines have an LSI 1068E MPT SAS (mptsas) controller connected to a Chenbro CK13601 36-port SAS expander, with one machine having 16 1T WD disks hooked up to it, and the other having a mix of about 20 WD/Seagate/Samsung/Hitachi 1T and 2T disks. When there's a disk I/O stall, all reads and writes to any disk behind the SAS controller/expander just hang for a while (typically for almost exactly eight seconds), so not just the I/O to one particular disk or a subset of the disks. The disks on other (on-board SATA) controllers still pass I/O requests when the SAS I/O stalls. I hacked up the attached (dirty) perl script to demonstrate this effect -- it will read /proc/diskstats in a tight loop, and keep track of which request entered the request queue when, and when it completed, and it will WTF if a request took more than a second. (The same thing can probably be done with blktrace, but I was lazy.) New requests get submitted, but the pending ones fail to complete for a while, and then they all complete at once. This happens on kernel-2.6.18-164.11.1.el5, while reverting to the latest CentOS 5.3 kernel (kernel-2.6.18-128.7.1.el5) makes the issue go away again, i.e. no more stalls. It doesn't seem to matter whether the I/O load is high or not -- the stalls happen even under almost no load at all. Before I dig into this further, has anyone experienced anything similar? A quick google search didn't come up with much. thanks, Lennert #!/usr/bin/perl -w use Time::HiRes qw(clock_gettime CLOCK_MONOTONIC); sub read_stats { local *STATS; my %stats; %stats = (); open STATS, "< /proc/diskstats"; while (<STATS>) { my @fields; chomp; @fields = split; if ($fields[2] =~ /^sd[a-z]$/) { my $head = $fields[3] + $fields[7]; my $tail = $head + $fields[11]; $stats{$fields[2]} = [ $head, $tail ]; } } close STATS; return \%stats; } my %heads; my %queues; %heads = (); %queues = (); while (1) { my $now; my %stats; my $disk; $now = clock_gettime(CLOCK_MONOTONIC); %stats = %{read_stats()}; foreach $disk (sort keys %stats) { my $head; my $tail; next if ($disk eq "sda"); $head = $stats{$disk}->[0]; $tail = $stats{$disk}->[1]; if (not defined $heads{$disk}) { print "new disk $disk\n"; $heads{$disk} = $head; $queues{$disk} = []; } while ($heads{$disk} + scalar(@{$queues{$disk}}) < $tail) { push @{$queues{$disk}}, $now; print "$now: $disk add -> " . scalar(@{$queues{$disk}}) . "\n"; } while ($heads{$disk} < $head) { my $s; $heads{$disk}++; $s = int(1000 * ($now - (shift @{$queues{$disk}}))); if ($s > 1000) { print "WTF on " . localtime(time()) . " for $s ms\n"; } print "$now: $disk remove after $s ms -> " . scalar(@{$queues{$disk}}) . "\n"; } } }