[CentOS] 3Ware 9550SX and latency/system responsiveness

Tue Oct 2 16:16:59 UTC 2007
Simon Banton <centos at web.org.uk>

>What is the recurring performance problem you are seeing?

Pretty much exactly the symptoms described in 
http://bugzilla.kernel.org/show_bug.cgi?id=7372 relating to read 
starvation under heavy write IO causing sluggish system response.

I recently graphed the blocks in/blocks out from vmstat 1 for the 
same test using each of the four IO schedulers (see the PDF attached 
to the article below):

http://community.novacaster.com/showarticle.pl?id=7492
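
For anyone wanting to reproduce the graphs, here is a minimal sketch of one way to pull the blocks in/blocks out columns from vmstat into CSV for plotting. The function name vmstat_to_csv and the CSV layout are my own illustrative choices, not necessarily how the PDF was produced. It finds the bi and bo columns by name from the header row, so it does not depend on a fixed column position.

```shell
# Hypothetical helper: turn `vmstat 1` output on stdin into "sample,bi,bo" CSV.
vmstat_to_csv() {
    awk '
        BEGIN { print "sample,bi,bo" }
        # Column-name header row (repeated periodically by vmstat):
        # locate the bi and bo columns by name.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) {
                if ($i == "bi") bi = i
                if ($i == "bo") bo = i
            }
            next
        }
        # Data rows start with a number; the "procs ..." banner row does not.
        bi && bo && $1 ~ /^[0-9]+$/ { print ++n "," $bi "," $bo }
    '
}
```

Usage would be something like `vmstat 1 120 | vmstat_to_csv > cfq.csv`, started just before the test, with one file per scheduler.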

The test was:

dd if=/dev/sda of=/dev/null bs=1M count=4096 & sleep 5
dd if=/dev/zero of=./4G bs=1M count=4096 &
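
Repeating that test per scheduler could be sketched roughly as below. This assumes the scheduler is switched at runtime through /sys/block/<dev>/queue/scheduler and that the four schedulers are noop, anticipatory, deadline and cfq. The function names and the dev/sched parameters are hypothetical, and run_test needs root.

```shell
# Print the active scheduler from sysfs-style text on stdin,
# e.g. "noop anticipatory deadline [cfq]" gives "cfq".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Hypothetical wrapper: run the read-then-write test once under one scheduler.
# Example: run_test sda deadline
run_test() {
    dev=$1
    sched=$2
    echo "$sched" > "/sys/block/$dev/queue/scheduler" || return 1
    echo "active: $(active_scheduler < "/sys/block/$dev/queue/scheduler")"
    dd if="/dev/$dev" of=/dev/null bs=1M count=4096 &
    sleep 5
    dd if=/dev/zero of=./4G bs=1M count=4096 &
    wait    # let both dd processes finish before the next run
}
```

A loop such as `for s in noop anticipatory deadline cfq; do run_test sda "$s"; done` would then cover all four, with vmstat 1 logging in another terminal.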

Despite how the graphs look, interactive responsiveness subjectively felt 
better using deadline than cfq. This is obviously an atypical 
workload, though, so I'm now focusing on finishing the build of the 
machine so I can profile the more typical patterns of activity 
it'll experience when in use.

I find myself wondering whether the fact that the array looks like a 
single SCSI disk to the OS means that cfq is able to perform better 
in terms of interleaving reads and writes to the card but that some 
side effect of its work is causing the responsiveness issue at the 
same time. Pure speculation on my part - this is way outside my 
experience.

I'm also looking into trying an Areca card instead (avoiding LSI 
because they're cited as having the same issue in the bugzilla 
entry mentioned above).

S.