[CentOS] ssacli start rebuild?

Thu Nov 12 02:30:36 UTC 2020
Valeri Galtsev <galtsev at kicp.uchicago.edu>


> On Nov 11, 2020, at 8:04 PM, John Pierce <jhn.pierce at gmail.com> wrote:
> 
> in large raids, I label my disks with the last 4 or 6 digits of the drive
> serial number (or for SAS disks, the WWN).    this is visible via smartctl,
> and I record it with the zpool documentation I keep on each server
> (typically a text file on a cloud drive).  

I get notified about software RAID failures by a cron job running raid-check (which comes with the mdadm rpm). I can get the S/N of a failed drive using smartctl (they are not dead-dead; one can still query them), but I am too lazy to print all the drive serial numbers and affix them to the fronts of the drive trays… so far, though, I see no other way ;-(
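
Concretely, that amounts to something like the following (the array /dev/md0
and the member /dev/sdk are just placeholders for whatever raid-check
reports):

   cat /proc/mdstat                        # which array is degraded
   mdadm --detail /dev/md0                 # which member got kicked out
   smartctl -i /dev/sdk | grep -i serial   # S/N of the suspect, while it still answers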

Valeri 

>   zpools don't actually care
> WHAT slot a given pool member is in; you can shut the box down, shuffle all
> the disks, boot back up, and find them all and put them back in the pool.
> 
> the physical error reports that precede a drive failure should list the
> drive identification beyond just the /dev/sdX kind of thing, which is
> subject to change if you add more SAS devices.
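> 
> (a concrete illustration; the pool and device names below are only examples,
> the point being that the persistent names under /dev/disk/by-id already
> carry the serial or WWN and so survive any reshuffling:)
> 
>    ls -l /dev/disk/by-id/ | grep -v part   # serial/WWN based names -> /dev/sdX
>    zpool status tank                       # members show under those names if
>                                            # the pool was built from by-id paths
>    smartctl -i /dev/sda | grep -i serial   # cross-check against the printed label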
> 
> I once researched what it would take to implement the drive failure lights
> on the typical brand-name server/storage chassis. There's a command for
> manipulating SES devices such as those lights; the catch is figuring out
> the mapping between the drives and lights. It's not always evident, so it
> would require trial and error.
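> 
> (the sg3_utils package is one way to poke at it; the enclosure device and
> slot index below are only examples, and the drive-to-slot mapping still has
> to be verified against the SAS addresses sg_ses reports:)
> 
>    lsscsi -g                                 # find the enclosure's /dev/sgN
>    sg_ses -p aes /dev/sg3                    # per-slot SAS addresses, match them
>                                              # to the drives' WWNs from smartctl
>    sg_ses --index=5 --set=ident /dev/sg3     # light the locate LED on slot 5
>    sg_ses --index=5 --clear=ident /dev/sg3   # and turn it off again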
> 
> 
> 
> On Wed, Nov 11, 2020 at 5:37 PM Valeri Galtsev <galtsev at kicp.uchicago.edu>
> wrote:
> 
>> 
>> 
>>> On Nov 11, 2020, at 6:00 PM, John Pierce <jhn.pierce at gmail.com> wrote:
>>> 
>>> On Wed, Nov 11, 2020 at 3:38 PM Warren Young <warren at etr-usa.com> wrote:
>>> 
>>>> On Nov 11, 2020, at 2:01 PM, hw <hw at gc-24.de> wrote:
>>>>> 
>>>>> I have yet to see software RAID that doesn't kill the performance.
>>>> 
>>>> When was the last time you tried it?
>>>> 
>>>> Why would you expect that a modern 8-core Intel CPU would impede I/O in
>>>> any measurable way as compared to the outdated single-core 32-bit RISC CPU
>>>> typically found on hardware RAID cards?  These are the same CPUs, mind,
>>>> that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a
>>>> much tougher task than mediating spinning disk I/O.
>>> 
>>> 
>>> the only 'advantage' hardware raid has is write-back caching.
>> 
>> Just for my information: how do you map a failed software RAID drive to the
>> physical port of, say, a SAS-attached enclosure? I’d love to hot replace
>> failed drives in software RAIDs; I have over a hundred physical drives
>> attached to one machine. Do not criticize, this is a box installed by someone
>> else, I have “inherited” it. To replace a drive I have to query its serial
>> number, power off the machine, and pull drives one at a time to read the
>> labels...
>> 
>> With hardware RAID that is not an issue: I always know which physical port
>> the failed drive is in, and I can tell the controller to “indicate” a
>> specific drive (it blinks the respective port LED). I always hot replace
>> drives in hardware RAIDs; no one ever knows it has been done. And I’d love
>> to deal with drives in software RAIDs the same way.
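>> 
>> (On the HPE controllers from the subject line, that “indicate” step is, if I
>> remember the syntax right, something like this; controller slot and drive
>> address are only examples:)
>> 
>>    ssacli ctrl slot=0 pd 2I:1:6 modify led=on    # blink the drive's LED
>>    ssacli ctrl slot=0 pd 2I:1:6 modify led=off   # stop blinking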
>> 
>> Thanks in advance for any advice. And my apologies for “stealing the thread”.
>> 
>> Valeri
>> 
>>> with ZFS you can get much the same performance boost out of a small fast
>>> SSD used as a ZIL / SLOG.
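>>> 
>>> (adding one is a single command; the pool name and device path here are
>>> only examples:)
>>> 
>>>    zpool add tank log /dev/disk/by-id/nvme-EXAMPLE_SSD_0001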
>>> 
>>> --
>>> -john r pierce
>>> recycling used bits in santa cruz
>> 
> 
> 
> -- 
> -john r pierce
>  recycling used bits in santa cruz
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> https://lists.centos.org/mailman/listinfo/centos