[CentOS] 3Ware 9550SX and latency/system responsiveness
centos at web.org.uk
Fri Sep 21 18:12:47 UTC 2007
>>At 17:34 +0800 14/9/07, Feizhou wrote:
>>>.oh....do you have a BBU for your write cache on your 3ware board?
>>Not installed, but the machine's on a UPS.
>Ugh. The 3ware code will not give OK then until the stuff has hit disk.
Having now installed BBUs, it's made no difference to the underlying
responsiveness problem I'm afraid.
With ports 2 and 3 now configured as RAID 0, with ext3 filesystem and
mounted on /mnt/raidtest, running this bonnie++ command:
bonnie++ -m RA-256_NR-8192 -n 0 -u 0 -r 4096 -s 20480 -f -b -d /mnt/raidtest
(RA- and NR- relate to kernel params for readahead and nr_requests
respectively - the values above are CentOS post-installation defaults)
...causes load to climb:
16:36:12 up 13 min, 2 users, load average: 8.77, 4.78, 1.98
... and uninterruptible processes:
ps ax | grep D
PID TTY STAT TIME COMMAND
59 ? D 0:03 [kswapd0]
2159 ? D 0:01 [kjournald]
2923 ? Ds 0:00 syslogd -m 0
4155 ? D 0:00 [pdflush]
4175 ? D 0:00 [pdflush]
4192 ? D 0:00 [pdflush]
4193 ? D 0:00 [pdflush]
4197 ? D 0:00 [pdflush]
4199 ? D 0:00 [pdflush]
4201 pts/1 R+ 0:00 grep D
... plus an Out of Memory kill of sshd. Second time around (logged in
on the console rather than over ssh), it's just the same except it's
hald that happens to get clobbered instead.
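A side note on the ps listing above: "grep D" matches a capital D anywhere on the line (which is why the grep itself shows up in it), so anyone reproducing this may prefer to filter on the STAT column. A minimal sketch, assuming a procps-style ps that accepts -eo; the function names are just made up for illustration:

```shell
# Keep the header line plus any row whose second column (STAT) starts with D,
# i.e. processes in uninterruptible sleep.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Feed a pid/stat/command listing through the filter.
# Assumption: ps supports the procps "-eo" output-format option.
list_d_state() {
    ps -eo pid,stat,args | filter_d_state
}
```

Running list_d_state while bonnie++ is going should show the same kswapd/kjournald/pdflush entries as above, without the "grep D" line.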
Now that the presence or otherwise of a BBU has been ruled out along
with OS, 3ware-recommended kernel param tweaks, RAID level, LVM, slot
speed, different but identical-spec hardware (both machine and card),
what's left to try?
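For anyone comparing runs with different kernel param settings, the two values encoded in the bonnie++ -m label can be read back from sysfs before each run. This is only a sketch under a few assumptions: that the kernel exposes queue/read_ahead_kb and queue/nr_requests for the device, that the RA- figure is in 512-byte sectors (the unit blockdev --getra reports, so twice the KB value), and that "sda" stands in for whatever device the 3ware unit appears as. The function name is hypothetical.

```shell
# Print an RA-<readahead>_NR-<nr_requests> label for one block device.
# $1: device name, e.g. sda (assumption: substitute the 3ware unit's device)
# $2: sysfs root, defaults to /sys (overridable so the function can be tested)
tunables_label() {
    dev=$1
    sysfs=${2:-/sys}
    queue="$sysfs/block/$dev/queue"
    ra_kb=$(cat "$queue/read_ahead_kb") || return 1
    nr=$(cat "$queue/nr_requests") || return 1
    # Convert KB to 512-byte sectors (assumed unit of the RA- figure).
    echo "RA-$((ra_kb * 2))_NR-$nr"
}
```

Something like bonnie++ -m "$(tunables_label sda)" then keeps each result labelled with the settings that were actually in force.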
I see there's a new firmware version out today (3ware codeset 22.214.171.124
- driver's still at 2.26.05.007 but the fw's updated from
3.08.02.005 to 3.08.02.007), so I guess I'll update it and push the
whole thing back up the hill for another go.
If there's anyone out there with a 9550SX and a two-disk RAID 1 or
RAID 0 config on CentOS 4.5 who can give the above bonnie++ benchmark
a go (params adjusted for their own installed RAM - I'm benchmarking
using 5x my installed amount) and let me know if they also have the
same responsiveness problem or not, I'd seriously appreciate it.
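To adjust the params for a different amount of RAM, the -r and -s values (both in MB) can be derived from /proc/meminfo. A rough sketch; the function name is made up, and note that MemTotal is usually a little under the nominal installed RAM, so the numbers will not be round:

```shell
# Print "-r <ram_mb> -s <size_mb>" for bonnie++, with size = multiplier x RAM.
# $1: multiplier (5 is what this thread uses); defaults to 5
# $2: meminfo path, defaults to /proc/meminfo (overridable for testing)
bonnie_sizes() {
    mult=${1:-5}
    meminfo=${2:-/proc/meminfo}
    # MemTotal is reported in kB; convert to whole MB.
    ram_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' "$meminfo")
    echo "-r $ram_mb -s $((ram_mb * mult))"
}
```

The output drops into the command from earlier in place of the hard-coded values, e.g. bonnie++ -n 0 -u 0 $(bonnie_sizes 5) -f -b -d /mnt/raidtest (left unquoted on purpose so it splits into separate arguments).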