RAID rebuild time and disk utilization....


Tom Bishop

27 Sep 2010

3:15 p.m.

I'm in the process of building and testing a RAID setup, and it appeared to take a long time to build. I came across some settings for the minimum rebuild speed, and that helped, but it appears that one of the disks is struggling (100% utilization) compared with the other one. Has anyone else seen this, and if so, is there a solution for it? My two disks are a 1 TB Samsung F3 (/dev/sdb) and a 1 TB Seagate 7200.12 (/dev/sdc). smartctl looks good on both.

sar -dbpqu -P ALL 60 1
Linux 2.6.18-194.11.4.el5 (dpcserver)   09/27/2010

10:09:41 AM  CPU   %user   %nice  %system  %iowait   %steal    %idle
10:10:41 AM  all    0.03    0.00     1.11     0.00     0.00    98.86
10:10:41 AM    0    0.00    0.00     0.00     0.00     0.00   100.00
10:10:41 AM    1    0.10    0.00     1.25     0.00     0.00    98.65
10:10:41 AM    2    0.00    0.00     3.18     0.02     0.00    96.80
10:10:41 AM    3    0.00    0.00     0.02     0.00     0.00    99.98

10:09:41 AM     tps    rtps   wtps    bread/s  bwrtn/s
10:10:41 AM  838.27  836.41   1.87  107059.98    32.54

10:09:41 AM  DEV      tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz   await  svctm   %util
10:10:41 AM  sda     0.93      0.00     16.27     17.43      0.00    2.12   0.29    0.03
10:10:41 AM  sda1    0.00      0.00      0.00      0.00      0.00    0.00   0.00    0.00
10:10:41 AM  sda2    0.00      0.00      0.00      0.00      0.00    0.00   0.00    0.00
10:10:41 AM  sda3    0.93      0.00     16.27     17.43      0.00    2.12   0.29    0.03
10:10:41 AM  sdb   209.10  26764.99      0.00    128.00     30.90  148.02   4.78  100.02
10:10:41 AM  sdb1  209.10  26764.99      0.00    128.00     30.90  148.02   4.78  100.02
10:10:41 AM  sdc   209.10  26764.99      0.00    128.00      0.41    1.94   0.60   12.57
10:10:41 AM  sdc1  209.10  26764.99      0.00    128.00      0.41    1.94   0.60   12.57
10:10:41 AM  md0     0.00      0.00      0.00      0.00      0.00    0.00   0.00    0.00

10:09:41 AM  runq-sz  plist-sz  ldavg-1  ldavg-5  ldavg-15
10:10:41 AM        0       167     1.02     1.03      1.00
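
As a rough sanity check on these numbers, the per-member read rate that sar reports can be turned into an approximate time for one full pass over a 1 TB member. This is only a back-of-the-envelope sketch and not part of the original post: it assumes the rate stays constant for the whole pass (in practice it varies across the platter), that rd_sec/s counts 512-byte sectors, and that the full 1953525168 sectors the drives report in dmesg have to be read. The function names are made up for illustration.

```python
SECTOR_BYTES = 512  # sar and iostat report rd_sec/s in 512-byte sectors


def sectors_to_mb_per_sec(sectors_per_sec: float) -> float:
    """Convert a sectors/s rate into MB/s (decimal megabytes)."""
    return sectors_per_sec * SECTOR_BYTES / 1e6


def estimate_resync_hours(total_sectors: int, sectors_per_sec: float) -> float:
    """Hours needed to read total_sectors at a constant sectors_per_sec."""
    if sectors_per_sec <= 0:
        raise ValueError("rate must be positive")
    return total_sectors / sectors_per_sec / 3600.0


# Figures taken from the thread: rd_sec/s for sdb and sdc, and the sector
# count both 1 TB drives report in dmesg.
rate = 26764.99
disk_sectors = 1953525168

print(f"throughput: {sectors_to_mb_per_sec(rate):.1f} MB/s")
print(f"full pass:  {estimate_resync_hours(disk_sectors, rate):.1f} hours")
```

At the sampled rate this works out to under 14 MB/s per member and roughly 20 hours for a full pass, which is far below what either drive should sustain on sequential reads.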

Thanks in advance.....


Benjamin Franz

27 Sep 2010

5:19 p.m.


What is the output from 'cat /proc/mdstat'?

-- Benjamin Franz

Les Mikesell

5:24 p.m.


Also, head position/motion is the usual limiting factor with disk performance. Do you have other disk activity that will keep yanking the head on the active drive away from the track currently needed for the rebuild?

-- Les Mikesell lesmikesell@gmail.com

Ross Walker

10:54 p.m.


Also, also, if any of the drives are operating in PIO mode (Legacy Mode) then rebuild times will be horrific, as well as performance afterward. Make sure all drives are in SATA/AHCI mode.

-Ross

PS: look at svctm in iostat and make sure all drives are performing correctly. Maybe a bad cable or flaky port expander...

Tom Bishop

28 Sep 2010

12:05 a.m.

Thanks Ross, I pored through my dmesg logs and all looks well. Things appear fine, but I don't think the Samsung should be running at 100%. Something is not right, but it hasn't bitten me yet. Here are my dmesg logs:

scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 abar m1024@0xfe6ffc00 port 0xfe6ffd00 irq 225
ata2: SATA max UDMA/133 abar m1024@0xfe6ffc00 port 0xfe6ffd80 irq 225
ata3: SATA max UDMA/133 abar m1024@0xfe6ffc00 port 0xfe6ffe00 irq 225
ata4: SATA max UDMA/133 abar m1024@0xfe6ffc00 port 0xfe6ffe80 irq 225
ata5: SATA max UDMA/133 abar m1024@0xfe6ffc00 port 0xfe6fff00 irq 225
ata6: SATA max UDMA/133 abar m1024@0xfe6ffc00 port 0xfe6fff80 irq 225
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: WDC WD3200AAJS-00L7A0, 01.03E01, max UDMA/133
ata1.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-8: SAMSUNG HD103SJ, 1AJ10001, max UDMA/133
ata2.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-8: ST31000528AS, CC3E, max UDMA/133
ata3.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata3.00: configured for UDMA/133
ata4: SATA link down (SStatus 0 SControl 300)
ata5: SATA link down (SStatus 0 SControl 300)
ata6: SATA link down (SStatus 0 SControl 300)
Vendor: ATA Model: WDC WD3200AAJS-0 Rev: 01.0
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
sd 0:0:0:0: Attached scsi disk sda
Vendor: ATA Model: SAMSUNG HD103SJ Rev: 1AJ1
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdb: 1953525168 512-byte hdwr sectors (1000205 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 1953525168 512-byte hdwr sectors (1000205 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb: sdb1
sd 1:0:0:0: Attached scsi disk sdb
Vendor: ATA Model: ST31000528AS Rev: CC3E
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdc: 1953525168 512-byte hdwr sectors (1000205 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
SCSI device sdc: 1953525168 512-byte hdwr sectors (1000205 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
sdc: sdc1
sd 2:0:0:0: Attached scsi disk sdc
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.11.5-ioctl (2007-12-12) initialised: dm-devel@redhat.com
device-mapper: dm-raid45: initialized v0.2594l
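
For a log this long, the check Ross asked for (no drive stuck in PIO mode) can be scripted rather than done by eye. The following is a minimal sketch, not something posted in the thread: it assumes the kernel logs one "ataN.NN: configured for <mode>" line per device, as in the dmesg output here, and that a PIO device would show a mode string starting with "PIO" (for example "PIO4"); that second point is an assumption, since no such line appears in this log. The function names are hypothetical.

```python
import re

# Matches lines such as "ata2.00: configured for UDMA/133"
CONFIGURED = re.compile(r"^(ata\d+\.\d+): configured for (\S+)", re.MULTILINE)


def transfer_modes(dmesg_text: str) -> dict:
    """Map each ATA device (e.g. 'ata2.00') to the mode it was configured for."""
    return {dev: mode for dev, mode in CONFIGURED.findall(dmesg_text)}


def pio_devices(modes: dict) -> list:
    """Return the devices whose configured mode looks like a PIO mode."""
    return sorted(dev for dev, mode in modes.items()
                  if mode.upper().startswith("PIO"))


# The three "configured for" lines from the dmesg output in this message.
sample = """\
ata1.00: configured for UDMA/133
ata2.00: configured for UDMA/133
ata3.00: configured for UDMA/133
"""

modes = transfer_modes(sample)
print(modes)
print(pio_devices(modes) or "no devices in PIO mode")
```

On the lines from this log, all three drives come back as UDMA/133, which agrees with the conclusion that the controller setup looks fine.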

Also, smartctl output...

smartctl -A /dev/sdb
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME           FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED  RAW_VALUE
  1 Raw_Read_Error_Rate      0x002f  100   100   051    Pre-fail  Always   -            0
  2 Throughput_Performance   0x0026  252   252   000    Old_age   Always   -            0
  3 Spin_Up_Time             0x0023  071   071   025    Pre-fail  Always   -            9011
  4 Start_Stop_Count         0x0032  100   100   000    Old_age   Always   -            5
  5 Reallocated_Sector_Ct    0x0033  252   252   010    Pre-fail  Always   -            0
  7 Seek_Error_Rate          0x002e  252   252   051    Old_age   Always   -            0
  8 Seek_Time_Performance    0x0024  252   252   015    Old_age   Offline  -            0
  9 Power_On_Hours           0x0032  100   100   000    Old_age   Always   -            64
 10 Spin_Retry_Count         0x0032  252   252   051    Old_age   Always   -            0
 11 Calibration_Retry_Count  0x0032  252   252   000    Old_age   Always   -            0
 12 Power_Cycle_Count        0x0032  100   100   000    Old_age   Always   -            5
191 G-Sense_Error_Rate       0x0022  252   252   000    Old_age   Always   -            0
192 Power-Off_Retract_Count  0x0022  252   252   000    Old_age   Always   -            0
194 Temperature_Celsius      0x0002  057   053   000    Old_age   Always   -            43 (Lifetime Min/Max 20/47)
195 Hardware_ECC_Recovered   0x003a  100   100   000    Old_age   Always   -            0
196 Reallocated_Event_Count  0x0032  252   252   000    Old_age   Always   -            0
197 Current_Pending_Sector   0x0032  252   252   000    Old_age   Always   -            0
198 Offline_Uncorrectable    0x0030  252   252   000    Old_age   Offline  -            0
199 UDMA_CRC_Error_Count     0x0036  200   200   000    Old_age   Always   -            0
200 Multi_Zone_Error_Rate    0x002a  100   100   000    Old_age   Always   -            0
223 Load_Retry_Count         0x0032  252   252   000    Old_age   Always   -            0
225 Load_Cycle_Count         0x0032  100   100   000    Old_age   Always   -            5

_______________________________________________
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos

Tom Bishop

12:16 a.m.

Here are the iostats:

Device:  rrqm/s  wrqm/s     r/s   w/s    rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm   %util
sda        0.15    2.47    0.41  0.82     13.01   26.36     31.97      0.01   6.98   1.01    0.12
sda1       0.02    0.00    0.00  0.00      0.04    0.00     24.50      0.00   5.38   4.82    0.00
sda2       0.01    0.00    0.00  0.00      0.03    0.00     37.79      0.00   6.77   5.85    0.00
sda3       0.12    2.47    0.40  0.82     12.93   26.36     31.98      0.01   6.96   1.01    0.12
sdb        1.48    0.00  315.21  0.01  40533.39    0.75    128.59     26.94  85.45   2.80   88.24
sdb1       1.47    0.00  315.21  0.01  40533.30    0.75    128.59     26.94  85.45   2.80   88.24
sdc        1.04    0.00  315.65  0.01  40533.48    1.09    128.41      1.92   6.07   0.88   27.69
sdc1       1.02    0.00  315.65  0.01  40533.39    1.09    128.41      1.92   6.07   0.88   27.68
md0        0.00    0.00    0.00  0.00      0.00    0.00      8.00      0.00   0.00   0.00    0.00

Device:  rrqm/s  wrqm/s     r/s   w/s    rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm   %util
sda        0.00   16.80    0.00  1.00      0.00  142.40    142.40      0.00   1.00   0.60    0.06
sda1       0.00    0.00    0.00  0.00      0.00    0.00      0.00      0.00   0.00   0.00    0.00
sda2       0.00    0.00    0.00  0.00      0.00    0.00      0.00      0.00   0.00   0.00    0.00
sda3       0.00   16.80    0.00  1.00      0.00  142.40    142.40      0.00   1.00   0.60    0.06
sdb       56.40    0.00  189.60  0.00  30822.40    0.00    162.57     17.45  93.23   5.28  100.02
sdb1      56.40    0.00  189.60  0.00  30822.40    0.00    162.57     17.45  93.23   5.28  100.02
sdc        0.00    0.00  239.40  0.00  30643.20    0.00    128.00      0.69   2.89   0.68   16.24
sdc1       0.00    0.00  239.40  0.00  30643.20    0.00    128.00      0.69   2.89   0.68   16.24
md0        0.00    0.00    0.00  0.00      0.00    0.00      0.00      0.00   0.00   0.00    0.00

The Samsung is sdb; it is almost always at 100% utilization, and its svctm is higher than 2 ms.


Ross Walker

1:54 a.m.


Average queue size of 26.94 requests, average wait time of 85.45ms, service time of 2.8ms ain't bad, but means the sequential IO is randomizing and backing up the IO.
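
The iostat columns discussed here are tied together by two standard queueing identities (Little's law and the utilization law), which gives a quick consistency check on the sdb row. The sketch below is an illustration added for clarity, not part of Ross's message; the function names are invented, and it assumes await and svctm are in milliseconds, as iostat reports them.

```python
def expected_queue_depth(iops: float, await_ms: float) -> float:
    """Little's law: mean requests in the system = request rate x mean time in system."""
    return iops * await_ms / 1000.0


def expected_utilization_pct(iops: float, svctm_ms: float) -> float:
    """Utilization law: busy percentage = request rate x mean service time."""
    return iops * svctm_ms / 1000.0 * 100.0


# sdb row of the first iostat sample: r/s=315.21, w/s=0.01, await=85.45, svctm=2.80
iops = 315.21 + 0.01

print(f"queue depth: {expected_queue_depth(iops, 85.45):.2f}")
print(f"utilization: {expected_utilization_pct(iops, 2.80):.2f}%")
```

Both values land very close to the avgqu-sz (26.94) and %util (88.24) that iostat reported. The gap between await and svctm is time that requests spend waiting in the queue rather than being serviced, which is why a drive with a reasonable service time can still show a long wait.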

Chances are this is probably a 4k sector drive and the partition's alignment crosses a 4k page causing double reads. Better to start partitions on sector 2048 instead of 63.

Am I correct on these?

If so, I'd break the RAID, re-partition, and resilver it.

-Ross

Ross Walker

1:59 a.m.


I was wrong about the sector size; it's regular 512-byte sectors.

It still makes sense to look at the partition offset, but I would look at the cabling too.

-Ross

Tom Bishop

2:49 a.m.

How do I figure out if it's a 4k sector drive? I've read about that but never looked into it. Is there any way to tell? And about where to start my partition: I only have one large partition, since this is just for my data files, so do you mean I should start at sector 2048 and go up from there?

Here is the link to the Samsung: http://www.samsung.com/global/business/hdd/productmodel.do?group=72&type...

It talks about 512B per sector....which would be 4096....unless they changed something...

So should I break it and change the partitions, and if so, do I do it on both of the disks so they are the same? Thanks in advance; going to do some more reading.


John R Pierce

4:40 a.m.


A lot of new drives use 4K-byte sectors internally for various technical reasons. They pretend to have 512-byte sectors externally for compatibility.

By default, the first few blocks of the disk hold the MBR, and the first partition starts right after that. If you don't do anything special about this, odds are that 'right after that' is not on a 4K boundary. You need to 'trim' the start position of each partition so it's on a 4K boundary.
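
On the question of how to tell whether a drive uses 4K sectors internally: newer Linux kernels expose the sector sizes a drive reports under sysfs. The sketch below is an illustration only and was not posted in the thread. It assumes the files /sys/block/<device>/queue/logical_block_size and physical_block_size exist, which may not be the case on a kernel as old as the 2.6.18 used here, and it can only show what the drive claims (a drive that hides its 4K sectors will still report 512). The function names and the sysfs_root parameter are hypothetical conveniences.

```python
from pathlib import Path


def read_block_sizes(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the logical and physical block sizes a device reports, or None if absent."""
    queue = Path(sysfs_root) / device / "queue"
    sizes = {}
    for name in ("logical_block_size", "physical_block_size"):
        attr = queue / name
        # Older kernels do not provide these attributes at all.
        sizes[name] = int(attr.read_text().strip()) if attr.exists() else None
    return sizes


def looks_like_4k_drive(sizes: dict):
    """True/False from the reported physical size, or None when it is unknown."""
    physical = sizes.get("physical_block_size")
    return None if physical is None else physical >= 4096


if __name__ == "__main__":
    info = read_block_sizes("sdb")  # device name taken from the thread
    print(info, looks_like_4k_drive(info))
```

A result of None means the kernel gave no answer, in which case the manufacturer's data sheet is the remaining source.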

Ross Walker

2:10 p.m.


True, traditionally fdisk has made the first partition start on sector 63 (sectors 0-62 hold the MBR and maybe the grub secondary loader), as that is what DOS did. Sector 63 is 1 sector before a 4k boundary (sector 64 sits at 32 KiB, the start of the 9th 4k block), thus on a 4k-sector drive every 4k read and write will straddle two physical blocks. Sequential IO will suffer since the drive will have to seek back one for each step forward (if the block isn't in cache), and each write will incur a read.

This can be avoided by manually creating your partition at a given offset and/or manipulating your LVM metadata size so the first extent starts at the proper offset.

Sector 2048 (1MB) was chosen because not only is it on a 4k boundary, but it is also aligned with most RAID chunk sizes, and thus won't straddle two RAID chunks which incurs another penalty.

Windows 2008 and later default to sector 2048 and if you can control the partition offset it's recommended to do the same.
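
The alignment arithmetic above is easy to verify. This is a small illustrative sketch, not from the original message; it assumes 512-byte logical sectors, and the helper name is made up.

```python
SECTOR_BYTES = 512  # logical sector size assumed throughout the thread


def is_aligned(start_sector: int, boundary_bytes: int = 4096) -> bool:
    """True if a partition starting at start_sector begins on a boundary_bytes boundary."""
    return (start_sector * SECTOR_BYTES) % boundary_bytes == 0


ONE_MIB = 1024 * 1024

# The three start sectors mentioned in the thread.
for start in (63, 64, 2048):
    offset = start * SECTOR_BYTES
    print(f"sector {start:>4}: byte offset {offset:>8}, "
          f"4 KiB aligned: {is_aligned(start)}, "
          f"1 MiB aligned: {is_aligned(start, ONE_MIB)}")
```

Sector 63 fails both checks, sector 64 sits on a 4 KiB boundary but not a 1 MiB one, and sector 2048 satisfies both, which is why it also lines up with common RAID chunk sizes.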

Having said that I don't believe the OP's problem is completely due to misalignment, but a combo of that and hardware problems.

-Ross

Tom Bishop

2:40 p.m.

The Samsung model is not a 4K sector drive. I did tear the array down anyway and changed the partition to start at sector 64 instead of 63, and also tried 2048. When all was said and done, there was no change in performance. One final step was to move the drive to a different port and change the cable: I moved it to the port that the other drive was on and working fine. After doing that, it appears the problem follows the drive. I am thinking of swapping out the drive and going from there.


Les Mikesell

7:11 p.m.


Disks are cheap enough that it's probably not worth the fight to figure out what is wrong with this one. I have some identical Seagate 750G drives that I swap regularly and re-sync for offsite copies, and one of them has started to take about twice as long to complete as the others, with no indication of problems showing up in the logs.

-- Les Mikesell lesmikesell@gmail.com
