Simon Banton wrote:
At 12:30 +0200 2/10/07, matthias platzer wrote:
What I did to work around them was basically switching to XFS for everything except / (3ware say their cards are fast, but only on XFS), AND using very low nr_requests for every block device on the 3ware card.
Hi Matthias,
Thanks for this. In my CentOS 5 tests the nr_requests turned out by default to be 128, rather than the 8192 of CentOS 4.5. I'll have a go at reducing it still further.
Yes, nr_requests should be a realistic reflection of what the card itself can handle; if it is set too high you will see iowait stack up.
64 or 128 are good numbers; rarely have I seen a card that can handle a queue depth larger than 128 (some older SCSI cards did 256, I think).
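(For reference, nr_requests is a per-device sysfs tunable, so lowering it needs no reboot. A minimal sketch in Python; the device name "sda" and the depth of 128 are illustrative assumptions, not from the thread, and it must run as root:)

    #!/usr/bin/env python
    # Sketch: lower the request queue depth for a block device via sysfs.
    # "sda" is an assumed device name -- use whatever sits behind the 3ware card.
    import sys

    def set_nr_requests(device, depth):
        # Equivalent to: echo <depth> > /sys/block/<device>/queue/nr_requests
        with open("/sys/block/%s/queue/nr_requests" % device, "w") as f:
            f.write(str(depth))

    def get_nr_requests(device):
        with open("/sys/block/%s/queue/nr_requests" % device) as f:
            return f.read().strip()

    if __name__ == "__main__":
        dev = sys.argv[1] if len(sys.argv) > 1 else "sda"
        set_nr_requests(dev, 128)   # try 64 if iowait still stacks up
        print("nr_requests for %s is now %s" % (dev, get_nr_requests(dev)))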
If you can, you could also try _not_ putting the system disks on the 3ware card, because the 3ware driver/card additionally gives writes priority.
I've noticed that kicking off a simultaneous pair of dd reads and writes from/to the RAID 1 array demonstrates that very clearly: only with cfq as the elevator did reads get any kind of look-in. Sadly, I'm not able to separate the system disks off, as there's no on-board SATA on the motherboard nor any room for internal disks; the original intention was to provide the resilience of hardware RAID 1 for the entire machine.
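(For anyone wanting to reproduce that probe, here is a rough sketch of it wrapped in Python so both dd runs start together. The device and output paths are assumptions; the write target should live on the array under test, and dd prints its throughput to stderr when each run finishes:)

    #!/usr/bin/env python
    # Sketch: launch a dd read and a dd write simultaneously against the array.
    import subprocess

    # Illustrative paths: read raw from the array, write a scratch file on it.
    read_cmd  = ["dd", "if=/dev/sda", "of=/dev/null", "bs=1M", "count=2048"]
    write_cmd = ["dd", "if=/dev/zero", "of=/mnt/array/ddtest", "bs=1M", "count=2048"]

    procs = [subprocess.Popen(cmd) for cmd in (read_cmd, write_cmd)]
    for p in procs:
        p.wait()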
CFQ will give reads first-to-the-line priority, but this can cause all sorts of negative side effects in a RAID setup: a workload can be such that a read operation depends on a write succeeding first, yet both were issued in an overlapping I/O scenario; you can see the problem. If reads are getting starved with your workload you can try 'anticipatory', but if I remember correctly you have BBU write-back cache enabled, and that should really limit the impact.
You will always see an impact, though; that is just the nature of it.
Writes will beat reads, random will beat sequential; it's the rock-paper-scissors game that all storage systems must play.
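(Switching elevators to compare them is also a runtime sysfs operation, no reboot needed. A minimal sketch, assuming the device is "sda" and running as root:)

    #!/usr/bin/env python
    # Sketch: inspect and switch the I/O elevator for a device at runtime.
    # Reading the file lists all schedulers with the active one in brackets,
    # e.g. "noop anticipatory deadline [cfq]" on a CentOS 5 era kernel.

    def get_elevator(device):
        with open("/sys/block/%s/queue/scheduler" % device) as f:
            return f.read().strip()

    def set_elevator(device, name):
        with open("/sys/block/%s/queue/scheduler" % device, "w") as f:
            f.write(name)

    if __name__ == "__main__":
        dev = "sda"                        # assumed device name
        print(get_elevator(dev))
        set_elevator(dev, "anticipatory")  # or "deadline", "cfq", "noop"
        print(get_elevator(dev))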
People suggested the unresponsive system behaviour is because the CPU hangs in iowait for the writes, and reading the system binaries won't happen until the writes are done, so the binaries should be on another I/O path.
Yup, that certainly seems to be what's happening. Wish I had another I/O path...
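(One way to put numbers on that iowait stall while a write load runs is to sample the aggregate "cpu" line in /proc/stat. A rough sketch; the 5-second interval is arbitrary:)

    #!/usr/bin/env python
    # Sketch: print the percentage of CPU time spent in iowait every 5 seconds.
    # On 2.6 kernels the aggregate "cpu" line in /proc/stat reads:
    #   user nice system idle iowait irq softirq [steal]
    import time

    def cpu_times():
        with open("/proc/stat") as f:
            return [int(v) for v in f.readline().split()[1:]]

    prev = cpu_times()
    while True:
        time.sleep(5)
        cur = cpu_times()
        delta = [c - p for c, p in zip(cur, prev)]
        total = sum(delta)
        print("iowait: %.1f%%" % (100.0 * delta[4] / total if total else 0.0))
        prev = cur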
You can have another I/O path: just add more disks to the 3ware, create another RAID array, and locate your application data there.
All this seems to be a symptom of a very complex issue involving kernel bugs/bad drivers/..., and it seems to be at its worst on an AMD/3ware combination. Here is another link: http://bugzilla.kernel.org/show_bug.cgi?id=7372
Ouch - thanks for that link :-( Looks like I'm screwed big time.
There is always a way out of any mess (without scrapping the whole project).
-Ross