How can I tell which HDD to swap when "cat /proc/mdstat" says one HDD of the RAID1 array has died?
Do HDDs have serial numbers that I can see on the physical drive and can also get from, e.g., a command's output?
How can I know which HDD to swap in, e.g., a RAID1 array?
thank you
Eugeneapolinary Ju wrote:
Do HDDs have serial numbers that I can see on the physical drive and can also get from, e.g., a command's output?
You may be able to get serial numbers from hdparm or a similar tool.
How can I know which HDD to swap in, e.g., a RAID1 array?
When you set the array up, yank a drive and note which drive goes offline in the array, then label the drives so you know. If you're building many systems of the same type, then as long as your cabling is the same you should only need to do this once for all systems.
nate
Eugeneapolinary Ju wrote:
How can I tell which HDD to swap when "cat /proc/mdstat" says one HDD of the RAID1 array has died?
Do HDDs have serial numbers that I can see on the physical drive and can also get from, e.g., a command's output?
How can I know which HDD to swap in, e.g., a RAID1 array?
for devices that show up as /dev/hdX, try
# hdparm -i /dev/hda
for devices that show up as /dev/sdX,
# cat /proc/scsi/scsi
The latter won't show serial numbers, but it will list drive models and SCSI channel/id/lun, which should narrow it down. The first SCSI device listed that's 'Direct-Access' (as opposed to Processor or something else) is /dev/sda, the second is /dev/sdb.
Eugeneapolinary Ju wrote:
How can I tell which HDD to swap when "cat /proc/mdstat" says one HDD of the RAID1 array has died?
Do HDDs have serial numbers that I can see on the physical drive and can also get from, e.g., a command's output?
How can I know which HDD to swap in, e.g., a RAID1 array?
thank you
# smartctl -a /dev/sda
and so on for each remaining drive. This will give you the serial numbers of the working drives. You'll then have to power down and look for the drive whose serial number is not in your list. Make sure you have a good backup first.
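The serial-number survey described above could be scripted roughly as follows. This is only a sketch: the function name list_serials and the SMARTCTL override variable are made up for illustration, it assumes smartmontools is installed, and querying real disks normally requires root.

```shell
#!/bin/sh
# Print "device<TAB>serial" for every disk passed as an argument.
# SMARTCTL is a hypothetical override so the command can be swapped out
# for a dry run; by default the real smartctl is used.
SMARTCTL=${SMARTCTL:-smartctl}

list_serials() {
    for dev in "$@"; do
        # "smartctl -i" prints the identity section; the label is spelled
        # "Serial number:" or "Serial Number:" depending on drive type,
        # so compare it case-insensitively.
        serial=$("$SMARTCTL" -i "$dev" 2>/dev/null |
            awk -F': *' 'tolower($1) == "serial number" { print $2; exit }')
        # A disk that does not answer is reported as "unknown".
        printf '%s\t%s\n' "$dev" "${serial:-unknown}"
    done
}
```

Something like "list_serials /dev/sda /dev/sdb" would then give the list to compare against the labels on the physical drives; a dead drive will most likely show up as "unknown" or not answer at all.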
Eugeneapolinary Ju wrote:
How can I tell which HDD to swap when "cat /proc/mdstat" says one HDD of the RAID1 array has died?
Do HDDs have serial numbers that I can see on the physical drive and can also get from, e.g., a command's output?
How can I know which HDD to swap in, e.g., a RAID1 array?
If you can see activity lights, you can 'cat /dev/sd? >/dev/null' to make them busy, one at a time (where ? is a, b, c, etc).
John R Pierce wrote:
Les Mikesell wrote:
If you can see activity lights, you can 'cat /dev/sd? >/dev/null' to make them busy, one at a time (where ? is a, b, c, etc).
hahaha, I've done that, only my version is...
# dd if=/dev/sdX of=/dev/null bs=512
but, same difference
It's unfortunately about the best we've got when device names are assigned more or less randomly. NICs are even worse - we need a command to make the lights blink there too.
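A bounded variant of the read trick above, as a sketch: it keeps one device busy for a fixed number of seconds and then stops, so you can walk through the drives one at a time while watching the lights. The helper name busy_disk is made up, and the sketch assumes the GNU coreutils "timeout" command is available. Reading a device this way does not modify it.

```shell
#!/bin/sh
# Read from one device for a fixed number of seconds so that its
# activity LED stays lit, then stop.
busy_disk() {
    dev=$1
    secs=${2:-10}   # default: 10 seconds of reading
    timeout "$secs" dd if="$dev" of=/dev/null bs=1M 2>/dev/null
    status=$?
    # timeout exits with 124 when it stops dd at the deadline, which is
    # the expected outcome here; 0 means dd reached the end of the device.
    [ "$status" -eq 0 ] || [ "$status" -eq 124 ]
}
```

For example, "busy_disk /dev/sda 15" followed by "busy_disk /dev/sdb 15" (as root) should make each drive's light stand out in turn.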
At Mon, 02 Nov 2009 18:02:25 -0600 CentOS mailing list centos@centos.org wrote:
John R Pierce wrote:
Les Mikesell wrote:
If you can see activity lights, you can 'cat /dev/sd? >/dev/null' to make them busy, one at a time (where ? is a, b, c, etc).
hahaha, I've done that, only my version is...
# dd if=/dev/sdX of=/dev/null bs=512
but, same difference
It's unfortunately about the best we've got when device names are assigned more or less randomly. NICs are even worse - we need a command to make the lights blink there too.
AH for the days of HSZ70 boxes... There is a command to make a selected disk in the array blink its light...
At Mon, 2 Nov 2009 19:54:30 -0500 Robert Heller heller@deepsoft.com wrote:
At Mon, 02 Nov 2009 18:02:25 -0600 CentOS mailing list centos@centos.org wrote:
John R Pierce wrote:
Les Mikesell wrote:
If you can see activity lights, you can 'cat /dev/sd? >/dev/null' to make them busy, one at a time (where ? is a, b, c, etc).
hahaha, I've done that, only my version is...
# dd if=/dev/sdX of=/dev/null bs=512
but, same difference
It's unfortunately about the best we've got when device names are assigned more or less randomly. NICs are even worse - we need a command to make the lights blink there too.
AH for the days of HSZ70 boxes... There is a command to make a selected disk in the array blink its light...
Here is something that seems to work:
Install sg3_utils and do something like:
sudo /usr/bin/sg_turs -n=5000 /dev/sdX
(sg_turs does a TEST UNIT READY command, in the above case, it does 5000 of them. This is enough to light up the access light for like a second or so.)
On Mon, Nov 2, 2009 at 6:02 PM, Les Mikesell lesmikesell@gmail.com wrote:
It's unfortunately about the best we've got when device names are assigned more or less randomly. NICs are even worse - we need a command to make the lights blink there too.
Not to get too far off topic, but check out 'ethtool -p', which blinks the LED on the given interface's port. I've had luck with this in the past.
-jonathan
On Mon, Nov 2, 2009 at 7:02 PM, Les Mikesell lesmikesell@gmail.com wrote:
John R Pierce wrote:
Les Mikesell wrote:
If you can see activity lights, you can 'cat /dev/sd? >/dev/null' to make them busy, one at a time (where ? is a, b, c, etc).
hahaha, I've done that, only my version is...
# dd if=/dev/sdX of=/dev/null bs=512
but, same difference
It's unfortunately about the best we've got when device names are assigned more or less randomly. NICs are even worse - we need a command to make the lights blink there too.
-- Les Mikesell@gmail.com _______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Fully second that. The dd read test is the best way to tell which drive is starting to fail. As soon as one does, it is not a bad idea to replace it.
Note that SMART and similar technologies sometimes fail to detect missing sectors even when the dd test stumbles upon them.
Boris.
At Mon, 2 Nov 2009 14:47:23 -0800 (PST) CentOS mailing list centos@centos.org wrote:
How can I tell which HDD to swap when "cat /proc/mdstat" says one HDD of the RAID1 array has died?
Use '/sbin/mdadm --detail /dev/md<mumble>' and note the drive name (/dev/sd<mumble> for SCSI or SATA, and /dev/hd<mumble> for IDE).
Then '/usr/sbin/smartctl -A <drive name from mdadm's listing>'
/usr/sbin/smartctl will give you lots of information. Near the beginning is the serial number.
You *should* be running smartd and you *should* be enabling all of your disks to be monitored with smartd.
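As a rough sketch of the first step (finding the device name of the failed member), the following filter reads text in /proc/mdstat format and prints the members that md has flagged with "(F)". The function name failed_members is made up for illustration. One caveat: a drive that has disappeared completely may simply be absent from the member list rather than flagged, so an empty result does not prove the array is healthy.

```shell
#!/bin/sh
# Read /proc/mdstat-style text on stdin and print "array /dev/member"
# for every member marked as failed, e.g. "sdb1[1](F)".
failed_members() {
    awk '$2 == ":" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                name = $i
                # Strip the "[slot](F)" suffix to leave the device name.
                sub(/\[[0-9]+\].*$/, "", name)
                print $1, "/dev/" name
            }
    }'
}
```

Typical use would be "failed_members < /proc/mdstat"; the device names it prints can then be passed to mdadm or smartctl.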
Do HDDs have serial numbers that I can see on the physical drive and can also get from, e.g., a command's output?
Yes. smartctl will list the disk's info, including serial number.
How can I know which HDD to swap in, e.g., a RAID1 array?
Presumably, you know which disk is connected to which controller port. mdadm will give you the name assigned by the O/S. For IDE this would be
/dev/hda -- primary master
/dev/hdb -- primary slave
/dev/hdc -- secondary master
/dev/hdd -- secondary slave
For SCSI, the disks get a letter a, b, c, d, etc. on a 'first come, first served' basis: disks are scanned in controller order, then in ID order: controller 0, disk 0, disk 1, disk 2 ... controller 1, disk 0, disk 1 ..
ahci SATA treats each SATA port as a separate 'controller', with only one device (0) and the ports are scanned in order.
Some SATA controllers can be (and are) set up to behave like IDE (PATA) controllers: this maps a pair of SATA ports as a master and slave pair of 'IDE' (PATA) disks. This can make things 'interesting'. The Dell PowerEdge 840 is like that. With the SCSI-layer Linux driver, the mapping becomes (when all four are in use):
Port 0: (primary master) /dev/sda
Port 1: (secondary master) /dev/sdc
Port 2: (primary slave) /dev/sdb
Port 3: (secondary slave) /dev/sdd
Letters are assigned in the order port 0, port 2, port 1, port 3, with unoccupied ports being skipped.
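Rather than working the letter assignment out by hand, on kernels that expose sysfs you can usually ask which controller path each sdX name hangs off. The sketch below is an assumption-laden illustration: the function name disk_paths and its optional directory argument are made up, and it relies on "readlink -f" from GNU coreutils. The resolved path normally contains the hostN entry and ends in the SCSI host:channel:id:lun address.

```shell
#!/bin/sh
# For each sd* entry under /sys/block, print the device name and the
# resolved sysfs path of the hardware it is attached to.
# The optional argument (hypothetical) points at a different directory,
# which is only useful for trying the function out without real disks.
disk_paths() {
    sysroot=${1:-/sys/block}
    for entry in "$sysroot"/sd*; do
        # Skip the unexpanded glob or entries without a device link.
        [ -e "$entry/device" ] || continue
        printf '/dev/%s -> %s\n' "${entry##*/}" "$(readlink -f "$entry/device")"
    done
}
```

Running "disk_paths" with no argument lists every sdX disk the kernel currently knows about, which can then be matched against the port order described above.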
thank you
Robert Heller wrote:
Use '/sbin/mdadm --detail /dev/md<mumble>' and note the drive name (/dev/sd<mumble> for SCSI or SATA, and /dev/hd<mumble> for IDE).
Then '/usr/sbin/smartctl -A <drive name from mdadm's listing>'
/usr/sbin/smartctl will give you lots of information. Near the beginning is the serial number.
You *should* be running smartd and you *should* be enabling all of your disks to be monitored with smartd.
# smartctl -A /dev/sda
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Current Drive Temperature: 27 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 3037412440
  Blocks received from initiator = 3755531488
  Blocks read from cache and sent to initiator = 861510971
  Number of read and write commands whose size <= segment size = 3430618702
  Number of read and write commands whose size > segment size = 73308725
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 33496.25
  number of minutes until next internal SMART test = 12
That doesn't seem to say anything about the serial number to me.
OH! -i is required.
# smartctl -i -A /dev/sda
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: SEAGATE ST373207LC Version: 0003
Serial number: 3XXXXXXXXXXXXXXXXXXXX
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Mon Nov 2 19:12:25 2009 PST
Device supports SMART and is Enabled
Temperature Warning Enabled
Current Drive Temperature: 27 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 3037412672
  Blocks received from initiator = 3755693456
  Blocks read from cache and sent to initiator = 861510971
  Number of read and write commands whose size <= segment size = 3430623987
  Number of read and write commands whose size > segment size = 73308751
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 33496.28
  number of minutes until next internal SMART test = 10
John R Pierce wrote:
Device: SEAGATE ST373207LC Version: 0003
Serial number: 3XXXXXXXXXXXXXXXXXXXX
Please don't tell me you obscured the serial number intentionally? I find it funny that people go out of their way to obscure their MAC addresses and even non-routable IP addresses, but obscuring the serial number of a HDD is a whole new level of paranoia. And this is coming from someone who is really paranoid (I inspect and accept or reject each and every HTTP cookie in my browsers, for example).
nate