[CentOS] SATA Raid 5 and losing a drive

Andy Green andy at warmcat.com
Tue Apr 11 18:07:37 UTC 2006


Andy Green wrote:
> Joshua Baker-LePain wrote:
> 
>> Did you try doing any I/O to the array?  In my limited experience with 
>> software RAID, it won't notice a drive missing until it tries to do 
>> something with said drive.
> 
> Yes I did do this, I copied a file to the mountpoint and did a sync. 
> Nothing.

Hm, Googling around suggests that everyone with SATA RAID may be 
experiencing the same lack of warning that their safety net just blew a 
hole through the server farm roof in a bid to reach escape velocity.

''...The error handling is very simple, but at this stage that is an 
advantage. Error handling code anywhere is inevitably both complex and 
sorely under-tested. libata error handling is intentionally simple. 
Positives: Easy to review and verify correctness. Never data corruption. 
Negatives: if an error occurs, libata will simply send the error back to 
the block layer. There are limited retries by the block layer, depending 
on the type of error, but there is never a bus reset.

Or in other words: "it's better to stop talking to the disk than 
compound existing problems with further problems."

As Serial ATA matures, and host- and device-side errata become apparent, 
the error handling will be slowly refined. I am planning to work with a 
few (kind!) disk vendors, to obtain special drives/firmwares that allow 
me to inject faults, and otherwise exercise error handling code.

Error handling improvements will almost certainly be required in order 
to implement features such as device hotplug.
...''

http://linux-ata.org/software-status.html
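
Since the warning is not going to arrive by itself, one stopgap is to poll 
what md believes about its members. Below is a minimal sketch, not a tested 
tool: it scans text in the /proc/mdstat format for a status field such as 
[UU_], where an underscore marks a missing member. The function name 
degraded_arrays and the sample text are made up for illustration. Note the 
limit: this only reports what md has already noticed, so if the error never 
makes it up from libata and the block layer, the array will still show as 
healthy here.

```python
import re

# Illustrative sample in the /proc/mdstat layout (not captured from a real box).
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid5]
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md0 : active raid5 sdb2[1] sda2[0]
      312363008 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: status} for arrays md reports with a missing member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        # An array stanza starts with a line like "md0 : active raid5 ..."
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The following line carries "[total/active] [UU_]"
        status = re.search(r"\[\d+/\d+\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(1):
                degraded[current] = status.group(1)
            current = None
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On a live system the same function could be fed the contents of 
/proc/mdstat from a cron job, mailing root whenever the result is not empty.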