At 12:30 +0200 2/10/07, matthias platzer wrote:

>What I did to work around them was basically switching to XFS for
>everything except / (3ware say their cards are fast, but only on
>XFS) AND using very low nr_requests for every blockdev on the 3ware
>card.

Hi Matthias,

Thanks for this. In my CentOS 5 tests nr_requests turned out to
default to 128, rather than the 8192 of CentOS 4.5. I'll have a go at
reducing it still further.

>If you can, you could also try _not_ putting the system disks on the
>3ware card, because additionally the 3ware driver/card gives writes
>priority.

I've noticed that kicking off a simultaneous pair of dd reads and
writes from/to the RAID 1 array shows that very clearly - only with
cfq as the elevator did reads get any kind of look-in.

Sadly, I'm not able to separate the system disks off: there's no
on-board SATA on the motherboard and no room for additional internal
disks. The original intention was to provide the resilience of
hardware RAID 1 for the entire machine.

>People suggested the unresponsive system behaviour is because the
>cpu hanging in iowait for writing and then reading the system
>binaries won't happen till the writes are done, so the binaries
>should be on another io path.

Yup, that certainly seems to be what's happening. Wish I had another
io path...

>All this seem to be symptoms of a very complex issue consisting of
>kernel bugs/bad drivers/... and they seem to be worst on a AMD/3ware
>Combination.
>here is another link:
>http://bugzilla.kernel.org/show_bug.cgi?id=7372

Ouch - thanks for that link :-( Looks like I'm screwed big time.

S.
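
P.S. For anyone finding this thread in the archives: nr_requests is a
per-device sysfs knob. A minimal sketch of checking and lowering it -
sdb is only a placeholder for wherever the 3ware array shows up, and
the setting does not survive a reboot:

   # show the current queue depth for the array (example device name)
   cat /sys/block/sdb/queue/nr_requests

   # lower it; takes effect immediately but is lost on reboot
   echo 32 > /sys/block/sdb/queue/nr_requests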
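
P.P.S. The read-vs-write test I mean, roughly - again the device node
and mount point are placeholders, and switching the elevator on the
fly needs a kernel that allows it:

   # select cfq for the array, where runtime switching is supported
   echo cfq > /sys/block/sdb/queue/scheduler

   # big streaming write in the background...
   dd if=/dev/zero of=/mnt/array/ddtest bs=1M count=4096 &

   # ...while timing a streaming read from the same array
   time dd if=/mnt/array/somefile of=/dev/null bs=1M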