On 7/12/2016 10:52 AM, m.roth@5-cent.us wrote:
> I'll mention it to my manager. However, much more important is finding something that will tell me *which* drive in a RAID just failed so I can replace it....
That's something that has remained a deep, dark secret in the Linux (and generic Unix) world, 'left as an exercise for the reader'. There's no standard for mapping SAS (or SCSI) backplane lights to specific drives, and in my experience the lights only work right on brand-name systems using their own proprietary RAID piles. There's a SAS/SCSI control command (it escapes me at the moment) that will turn the backplane lights on and off, but there's no standard glue connecting it to drive failure events. A quick batch of googling suggests sas2ircu (LSI proprietary?) and ledmon (https://sourceforge.net/projects/ledmon/) are worth investigating.
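For what it's worth, ledmon ships a ledctl tool that accepts patterns such as locate=/dev/sdX. Below is a minimal sketch of wrapping it, assuming ledctl is installed, you are root, and the enclosure actually honors the request (no guarantee on generic hardware). The helper names locate_cmd and set_locate are made up for illustration.

```python
import subprocess


def locate_cmd(device, on=True):
    """Build the ledctl command that turns a drive's locate LED on or off."""
    pattern = "locate" if on else "locate_off"
    return ["ledctl", f"{pattern}={device}"]


def set_locate(device, on=True):
    """Run ledctl; raises CalledProcessError if ledctl reports a failure."""
    # Assumption: ledctl is on PATH and the backplane is one ledmon supports.
    subprocess.run(locate_cmd(device, on), check=True)
```

Something like set_locate("/dev/sdc") would then light the tray for sdc, and set_locate("/dev/sdc", on=False) would turn it off again. Hooking this up to drive failure events is still the missing glue.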
I've printed labels with the partial WWN of each drive and stuck them on the hot-swap trays, and I identify the failed drive via those. To verify, I'll do something like a dd if=/dev/mdX of=/dev/null bs=65536 to make the lights of all the working drives blink as fast as possible, then check that the one I think I want to replace is the one that's not blinking.
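The label approach can be sketched in a few lines of Python. This assumes Linux software RAID (md), where /proc/mdstat flags failed members with (F), and a udev setup that creates wwn-* symlinks under /dev/disk/by-id. The function names are hypothetical, and hardware RAID controllers won't show up this way at all.

```python
import re
from pathlib import Path

# Matches member entries such as "sdc1[2](F)" on an mdstat array line.
MEMBER_RE = re.compile(r"(?P<dev>[\w-]+)\[\d+\](?P<flags>(?:\([A-Z]\))*)")


def failed_members(mdstat_text):
    """Return {array: [failed member devices]} parsed from /proc/mdstat text."""
    failed = {}
    for line in mdstat_text.splitlines():
        m = re.match(r"^(md\w+)\s*:\s*(.*)$", line)
        if not m:
            continue  # not an array line (header, block counts, etc.)
        array, rest = m.groups()
        bad = [mm.group("dev") for mm in MEMBER_RE.finditer(rest)
               if "(F)" in mm.group("flags")]
        if bad:
            failed[array] = bad
    return failed


def wwn_for(dev, by_id=Path("/dev/disk/by-id")):
    """Return the wwn-* symlink name that resolves to dev, or None."""
    if not by_id.is_dir():
        return None
    for link in sorted(by_id.glob("wwn-*")):
        if link.resolve().name == dev:
            return link.name
    return None
```

Feeding failed_members() the contents of /proc/mdstat and passing each result to wwn_for() gives the WWN to match against the tray labels. The dd blink test is still worth doing before pulling anything.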