[CentOS] 3ware disk failure -> hang
Les Mikesell
lesmikesell at gmail.com
Fri Jan 6 22:34:44 UTC 2006
On Fri, 2006-01-06 at 16:24, Bryan J. Smith wrote:
> > Or at least the typical hardware/driver errors aren't
> > fatal.
>
> I think you, and most software RAID users, continue to miss
> the _root_ cause. If you yank a drive out of a system, one
> that is being used _actively_, you are going to get a kernel
> panic. I've seen it on ATA and SCSI. It's _not_ a driver
> issue. It's the fact that you've lost a resource.
>
> The MD code does _not_ handle this. You have to tie into the
> hotplug system for 2.6 to hide the device's status from the
> MD code.
>
> Now maybe some SCSI drivers handle it differently. But it is
> _not_ a driver issue.
I don't understand this distinction. The kernel calls the
driver, which talks to the controller. There should be a
timeout around this, and either the controller's response or
the timeout should be fielded by the driver. How can
it not be a driver issue, unless the controller actually
locks the PC bus (which may be the case with the motherboard
IDE controllers; they generally won't boot with a bad drive
either)? You don't want to hide the status from the MD code; you
want the md driver to kick the device out when it has problems.
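
To make the layering concrete, here is a toy Python model of what I mean.
It is only a sketch, not kernel code: Disk, driver_read, Mirror and the
1000 ms timeout are all made-up names and values for illustration. The
"driver" bounds every request with a timeout and reports an error upward,
and the RAID layer reacts to that error by dropping the member and
reading from the surviving one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class DeviceError(Exception):
    """Raised by the hypothetical driver when a request fails or times out."""


@dataclass
class Disk:
    name: str
    data: bytes = b"block"
    # Simulated response time; None models a drive that never answers.
    response_ms: Optional[float] = 5.0


def driver_read(disk: Disk, timeout_ms: float = 1000.0) -> bytes:
    # The driver bounds every request: either the controller answers in
    # time, or an error is reported upward instead of waiting forever.
    if disk.response_ms is None or disk.response_ms > timeout_ms:
        raise DeviceError(f"{disk.name}: no response within {timeout_ms} ms")
    return disk.data


@dataclass
class Mirror:
    members: List[Disk]
    faulty: List[Disk] = field(default_factory=list)

    def read(self) -> bytes:
        # Iterate over a copy, because failed members are removed as we go.
        for disk in list(self.members):
            try:
                return driver_read(disk)
            except DeviceError:
                # The RAID layer sees the error and kicks the member out.
                self.members.remove(disk)
                self.faulty.append(disk)
        raise DeviceError("all mirror members have failed")


# Usage: sda never responds, so the read is served from sdb.
mirror = Mirror([Disk("sda", response_ms=None), Disk("sdb")])
result = mirror.read()
```

After the read, mirror.faulty holds sda and mirror.members holds only sdb.
The point is that the failure is visible to the RAID layer as an ordinary
error return, which is what lets it act on it.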
--
Les Mikesell
lesmikesell at gmail.com