Joshua Baker-LePain <jlb17 at duke.edu> wrote:
> I'm running an all software RAID50 ...
> This morning I came in to find the system hung.
> Turns out a disk went overnight on one of the 7500s,
> and rather than a graceful failover I got this:
> Jan 6 01:03:58 $SERVER kernel: 3w-xxxx: scsi2: Command
> failed: status = 0xc7,flags = 0x40, unit #3.
> Jan 6 01:04:02 $SERVER kernel: 3w-xxxx: scsi2: AEN: ERROR:
> Drive error: Port #3.
> Jan 6 01:04:10 $SERVER 3w-xxxx[2781]: ERROR: Drive error
> encountered on port 3 on controller ID:2. Check cables and
> drives for media errors. (0xa)

Yes, the drive failed.

Had you used the 3Ware's intelligent hardware RAID, it would have
hidden the drive disconnect from the system. You'd see a log entry
for the failure, and the array would show as being in a "degraded"
state.

Instead, you're using software RAID, and it's up to the kernel not
to hang or panic when a disk is no longer available. The problem
isn't the 3Ware controller, it's the software RAID logic in the
kernel.

> Any ideas as to what I can do to prevent this in the
> future?

Use the 3Ware card as it is intended: as a hardware RAID card.

> Having the system hang every time a disk dies is, well, less
> than optimal.

No joke. Hotplug support wasn't even offered until kernel 2.6, and
it still does _not_ work as advertised.

It's stuff like this that makes me want to strangle most advocates
of using 3Ware cards with software RAID. There are countless issues
like this -- far more than the alleged "hardware lock-in" negative
of using hardware RAID.

-- 
Bryan J. Smith     Professional, Technical Annoyance
b.j.smith at ieee.org     http://thebs413.blogspot.com
----------------------------------------------------
*** Speed doesn't kill, difference in speed does ***