[CentOS] 3ware 9650 issues
jlb17 at duke.edu
Sun Jun 22 04:04:47 UTC 2008
I've been having no end of trouble with a 3ware 9650SE-24M8 in a server that's
coming up on a year old. I've got 24 WDC WD5001ABYS drives (500GB) hooked to it,
running as a single RAID6 with a hot spare. The trouble boils down to the card
periodically throwing errors like the following:
sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting card.
Usually when this happens, it's followed by:
3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): Cache synchronization
On the less pleasant occasions, it's followed by:
scsi1: ERROR: (0x06:0x0036): Response queue (large) empty failed during reset
3w-9xxx: scsi1: ERROR: (0x06:0x002B): Controller reset failed during scsi host
sd 1:0:0:0: scsi: Device offlined - not ready after error recovery
This of course leads to several hours of downtime, as the system has to be powered
down (not just rebooted) and the volume then needs to be fscked. I've been back
and forth on this with both the vendor and (via the vendor) 3ware. The card
has been replaced, as well as the whole system. I'm running the latest
firmware and drivers from 3ware.
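For anyone who wants to compare notes on how often their own card does this, here is a rough sketch that tallies the kernel messages quoted above by their error codes. The codes are copied from those messages; the category names, the function names, and the idea of reading the log from stdin are my own assumptions, not anything 3ware provides.

```python
import re
import sys
from collections import Counter

# Error codes are taken verbatim from the kernel messages quoted above.
# The category names on the left are made up for this sketch.
PATTERNS = {
    "timeout_reset": re.compile(r"\(0x06:0x002C\)"),
    "cache_sync": re.compile(r"\(0x04:0x005E\)"),
    "queue_empty_failed": re.compile(r"\(0x06:0x0036\)"),
    "reset_failed": re.compile(r"\(0x06:0x002B\)"),
    "device_offlined": re.compile(r"Device offlined"),
}


def classify(line):
    """Return the category name of the first matching pattern, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines fall into each category."""
    counts = Counter()
    for line in lines:
        name = classify(line)
        if name is not None:
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Reads log text from stdin, e.g. piped from dmesg or a saved log file.
    for name, count in summarize(sys.stdin).items():
        print(f"{name}: {count}")
```

Saved under some name of your choosing (say tally_3ware.py, a hypothetical filename), it could be fed with something like "dmesg | python3 tally_3ware.py". A high timeout_reset count with no reset_failed entries would correspond to the "usual" case above, where the card recovers on its own.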
Have other folks had good luck with this card? What sorts of configs are you
running? I'm in the position of needing more storage, and I'm a bit gun-shy on
3ware at the moment...
QB3 Shared Cluster Sysadmin