I have a 750GB 3-member software RAID1 where 2 partitions are always present and the third is regularly rotated and re-synced (SATA disks in hot-swap bays). The resync time has become extremely variable recently, taking anywhere from 3 to 10 hours, even when the partition is unmounted and the drives aren't doing anything else, and regardless of what I echo into /proc/sys/dev/raid/speed_limit_min or _max. Is there some way to tell whether the drives are going bad, or to speed the resync up consistently if they aren't?
On Wed, 2011-01-12 at 12:38 -0600, Les Mikesell wrote:
echo 100000 > /proc/sys/dev/raid/speed_limit_min
echo 100000 > /proc/sys/dev/raid/speed_limit_max
Should equate to around 100 MiB/s. I suppose bad sectors or a punctured block will bring it to a crawl too. Check the disks out with SMART. One way to tell is to keep an "hdparm" baseline.
John
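For reference, the tuning and health checks suggested above can be sketched like this; /dev/sdX is a placeholder for the rotated member, and the smartctl/hdparm lines are shown commented because they touch real hardware:

```shell
# Raise the md resync throttle. The sysctls take KiB/s, so 100000 ~= 97 MiB/s.
MIN=100000
for f in /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max; do
    [ -w "$f" ] && echo "$MIN" > "$f"   # only writable as root on an md host
done
echo "requested floor: $((MIN / 1024)) MiB/s"

# Health and baseline checks on the rotated member (run as root on the real device):
# smartctl -a /dev/sdX | grep -iE 'reallocated|pending|uncorrect'
# hdparm -t /dev/sdX    # sequential-read baseline; log it and compare over time
```

If the SMART counters for reallocated or pending sectors are climbing, the drive is quietly retiring sectors, and resync speed will suffer long before the drive fails outright.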
On 12/01/2011 18:38, Les Mikesell wrote:
Ummmm.... RAID is not a backup policy. Fine, use hot-swap and rsync or similar, but you really shouldn't be relying on RAID rebuilds for that.
On 1/13/11 5:58 AM, Kevin Thorpe wrote:
Ummmm.... RAID is not a backup policy. Fine, use hot-swap and rsync or similar, but you really shouldn't be relying on RAID rebuilds for that.
These are backups to begin with. It is the archive disk for backuppc which contains millions of hardlinked files for the de-duplication it uses. It would probably take a week or more for rsync to complete a copy (I've tried, but could never leave it offline long enough for any file-oriented copy approach to finish). A raid sync should complete in the time it takes for one pass across the disk and does not require unmounting except momentarily while failing the member out. I don't see why this is not a good use of raid rebuilds (the offsite members are rotated so the one being rebuilt is not the only copy). It used to always complete in 2 or 3 hours. I would just like to know why it sometimes takes much longer.
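The rotation cycle described here can be sketched as follows. The array and member names (/dev/md0, /dev/sdc1) are assumptions, and the commands are echoed rather than executed so the sequence reads as a dry run:

```shell
# Dry-run wrapper: prints each command instead of running it.
run() { echo "+ $*"; }

run mdadm /dev/md0 --fail /dev/sdc1     # fail the offsite member out of the array
run mdadm /dev/md0 --remove /dev/sdc1   # detach it for transport
# ...swap disks in the hot-swap bay, then re-add the returning one...
run mdadm /dev/md0 --add /dev/sdc1      # md resyncs it in one pass over the disk
run cat /proc/mdstat                    # shows resync progress and estimated finish
```

Changing the body of run to `"$@"` would execute the sequence for real on a host with the matching devices.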
A related issue is that I'd like to use a laptop-sized drive for the offsite copy and have one with the same capacity. However, it uses 4K sectors and runs about 10x slower on writes even though read speed is the same. I understand that using a newer kernel and offsetting the start of the partition to a 4K boundary should fix this, but I obviously can't do that with the existing RAID. Does anyone have experience with the new 4K-sector drives on CentOS, or know if this will be addressed in CentOS 6?
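On the 4K-sector point, the usual fix is to start the partition on a 4 KiB-aligned LBA. A quick sanity check of the arithmetic (the parted line is shown commented, with /dev/sdd as a hypothetical device):

```shell
# A partition starting at LBA 2048 (512-byte logical sectors) begins at byte
# 1048576, a multiple of 4096, so every 4 KiB physical sector stays aligned.
START_LBA=2048
OFFSET=$((START_LBA * 512))
echo "offset=$OFFSET aligned=$((OFFSET % 4096 == 0))"   # prints offset=1048576 aligned=1
# To create such a partition (hypothetical device, run only on the real disk):
# parted -s /dev/sdd unit s mkpart primary 2048 100%
```

Misaligned partitions on 4K drives force a read-modify-write for every 512-byte-aligned write, which is consistent with the roughly 10x write slowdown described above.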
These are backups to begin with. It is the archive disk for backuppc which contains millions of hardlinked files for the de-duplication it uses. It would
Ouch! I understand now. BackupPC does leave an unholy mess of the filesystem. Using RAID isn't a good solution, but I guess it's almost the only one that works.
On 1/13/2011 10:23 AM, Kevin Thorpe wrote:
Ouch! I understand now. BackupPC does leave an unholy mess of the filesystem.
It uses the filesystem as designed... It's the file-oriented tools to copy/reproduce hardlinks that don't scale well.
Using RAID isn't a good solution, but I guess it's almost the only one that works.
I suppose it could be done with LVM snapshots, or by unmounting and using dd or partimage, but being able to reconstruct on the fly is one of the major features of RAID and something needed to make it useful at all. Do you have some specific reason to say it isn't a good solution, or are you just repeating the advice that RAID shouldn't be your only backup?
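The LVM-snapshot alternative mentioned here might look like the following. Volume names and sizes are assumptions, and the commands are echoed as a dry run since they modify real volumes:

```shell
# Dry-run wrapper: prints each command instead of executing it.
run() { echo "+ $*"; }

run lvcreate -s -L 10G -n bpc_snap /dev/vg0/backuppc       # point-in-time view
run dd if=/dev/vg0/bpc_snap of=/mnt/offsite/bpc.img bs=4M  # image the snapshot
run lvremove -f /dev/vg0/bpc_snap                          # drop the snapshot
```

The filesystem can stay mounted throughout: the snapshot absorbs new writes while dd copies a crash-consistent image, at the cost of needing free extents in the volume group.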
On Thu, Jan 13, 2011 at 8:56 AM, Les Mikesell lesmikesell@gmail.com wrote:
I experimented with using disk image files mounted through loopback for the BackupPC store. It seemed to work, but I didn't get much further than testing. It might be a way to use BackupPC and still be able to file-copy the backups by copying the container file after it's been unmounted. If it's big it won't be fast, but it should be faster than waiting for rsync to work out all the hard links.
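A sketch of that loopback approach, with hypothetical paths and a deliberately small size; the mkfs/mount steps need root, so they are shown commented:

```shell
IMG=/tmp/bpc.img
# Create a sparse container: apparent size 1 GiB, no blocks allocated yet.
truncate -s 1G "$IMG"
stat -c %s "$IMG"   # prints 1073741824
# As root (hypothetical mount point):
# mkfs.ext3 -F "$IMG"
# mount -o loop "$IMG" /var/lib/BackupPC
# ...later, unmount and copy the single container file offsite.
```

Because the container is sparse, it only consumes real disk blocks as the store fills, and the final copy is one large sequential read instead of millions of hardlink traversals.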