On Fri, 2006-01-06 at 16:24, Bryan J. Smith wrote:
> > Or at least the typical hardware/driver errors aren't
> > fatal.
>
> I think you, and most software RAID users, continue to miss
> the _root_ cause.  If you yank a drive out of a system, one
> that is being used _actively_, you are going to get a kernel
> panic.  I've seen it on ATA and SCSI.  It's _not_ a driver
> issue.  It's the fact that you've lost a resource.
>
> The MD code does _not_ handle this.  You have to tie into the
> hotplug system for 2.6 to hide the device's status from the
> MD code.
>
> Now maybe some SCSI drivers handle it differently.  But it is
> _not_ a driver issue.

I don't understand this distinction.  The kernel calls the driver,
which talks to the controller.  There should be a timeout around
this, and either the controller's response or the timeout should be
fielded by the driver.  How can it not be a driver issue, unless the
controller actually locks the PC bus (which may be the case with
the motherboard IDE controllers - they generally won't boot with
a bad drive either)?

You don't want to hide the status from the MD code - you want the
md driver to kick the device out when it has problems.

-- 
  Les Mikesell
    lesmikesell at gmail.com
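
P.S. To make the model I have in mind concrete, here is a toy sketch
in Python. It is not kernel code and does not claim to show how the
real md driver or any disk driver is written; the Member and Mirror
classes, the timeout value and the device names are all made up for
illustration. It only shows the behaviour I am arguing for: every
request is bounded by a timeout, an error or a timeout is fielded by
the layer below, and the mirror marks that member faulty and carries
on with the remaining one.

```python
import concurrent.futures
import time


class Member:
    """Hypothetical stand-in for one disk behind its driver."""

    def __init__(self, name, delay=0.0, fail=False):
        self.name = name
        self.delay = delay    # simulated controller response time (seconds)
        self.fail = fail      # simulate a hard I/O error
        self.faulty = False   # set by the mirror when it kicks the member

    def read(self, block):
        time.sleep(self.delay)
        if self.fail:
            raise IOError(f"{self.name}: I/O error on block {block}")
        return f"data[{block}]"


class Mirror:
    """Toy RAID-1: read from the first healthy member, kick bad ones."""

    def __init__(self, members, timeout=0.05):
        self.members = members
        self.timeout = timeout  # assumed value, purely illustrative
        self.pool = concurrent.futures.ThreadPoolExecutor(
            max_workers=len(members))

    def read(self, block):
        for member in self.members:
            if member.faulty:
                continue
            future = self.pool.submit(member.read, block)
            try:
                # Bound the wait: an error or a timeout is reported
                # upward instead of hanging the caller.
                return future.result(timeout=self.timeout)
            except (IOError, concurrent.futures.TimeoutError):
                member.faulty = True  # kick the member out of the array
        raise IOError("no healthy members left")


if __name__ == "__main__":
    yanked = Member("sda", delay=0.5)   # never answers in time
    good = Member("sdb")
    mirror = Mirror([yanked, good])
    print(mirror.read(7), "faulty:", [m.name for m in mirror.members if m.faulty])
```

The point of the sketch is only that the upper layer sees the failure
and acts on it, rather than having the device's status hidden from it.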