[CentOS] OT, hardware: HP smart array drive issue

Fri Jul 10 18:13:14 UTC 2015
Jason Warr <jason at warr.net>

On 7/10/2015 12:49 PM, m.roth at 5-cent.us wrote:
> Jason Warr wrote:
>> On July 10, 2015 11:47:09 AM CDT, m.roth at 5-cent.us wrote:
>>> Hi. Anyone working with these things? I've got a drive in "predictive
>>> failure" in a RAID5. Now here's the thing: there was an issue
>>> yesterday when I got in, and I wound up power cycling the RAID;
>>> first boot of attached server had issues, and said the controller
>>> had a failure, and a drive had failed, and wouldn't continue
>>> booting; when I gave it the three-finger salute, this time on the
>>> way up, during POST, it noted the controller issue... but the
>>> thing came up, looking like it did a couple of days ago.
>>> Trying to prevent this from happening again, I've decided to replace
>>> the drive that's in predictive failure. The array has a hot spare.
>>> I tried to remove it using hpacucli, but it refuses with "operation
>>> not permitted", and there doesn't *seem* to be a "mark as failed"
>>> command. *Do* I just yank the drive?
>> Yep, just yank it.  It should start auto rebuilding on the spare.
>> If you didn't have a spare you would pull the suspect drive and replace it
>> with one of equal or greater capacity and it would auto rebuild as well.
>> I have a bunch of them at home and have been using them at work for years.
> Thanks for your quick reply, Jason. I'm used to LSI/MegaRAID/PERCs, where
> you have to fail it, first. Oddity: I had the drive out for more than five
> minutes while getting it out of the sled, putting the new one in, oh, and
> dusting out the slot (gotta do that for all of them, next maintenance
> window), but after I put in the replacement, and used hpacucli to check,
> to my surprise it was rebuilding with the replacement, *not* with the
> spare.
>          mark
It has been a while since I have used a spare, but what might have
happened is that the spare went back to being a spare when the real
drive was replaced.  It seems to me that is the default behavior, since
a spare can be attached to more than one RAID group.  That way it keeps
your physical drive placement consistent.
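
To make that guess concrete, here is a toy Python model of the behavior
described above. It is only a sketch of the hypothesis (spare steps in
when a member is pulled, then returns to standby once a replacement shows
up in the original bay); it does not talk to a controller and is not taken
from HP firmware or hpacucli. The class name, bay names and state strings
are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyArray:
    """Toy model of one RAID group with a hot spare (illustration only)."""
    bays: dict = field(default_factory=dict)  # hypothetical bay name -> state
    spare: str = "standby"                    # "standby" or "active"

    def pull(self, bay: str) -> None:
        # Assumption: when a member disappears, the spare steps in for it.
        self.bays[bay] = "missing"
        self.spare = "active"

    def replace(self, bay: str) -> None:
        # Assumption: a new drive in the original bay becomes the rebuild
        # target, and the spare is released back to standby.
        self.bays[bay] = "rebuilding"
        self.spare = "standby"


array = ToyArray(bays={"bay1": "ok", "bay2": "predictive failure", "bay3": "ok"})

array.pull("bay2")
state_after_pull = (array.bays["bay2"], array.spare)

array.replace("bay2")
state_after_replace = (array.bays["bay2"], array.spare)

print(state_after_pull, state_after_replace)
```

If the model matches what the controller really does, the replacement ends
up in the same bay as the drive it replaced, and the spare stays free for
any other array that shares it.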