We have a CentOS 3 server with about 300GB of data on an ext2 filesystem that we need to mirror onto a new drive, which we're then going to pull out and put into a second server. A straight disk-to-disk copy (rsync, tar, or "cp -a"; it doesn't much matter which) manages about 75MB per minute, so the full copy would take almost three days. The system gets very sluggish while such a copy is running, so we can't afford to just let it run.
Is it possible, without loss of data, to convert the existing ext2 filesystem into a mirrored software RAID, then add the new drive as a second device and let rebuilding the RAID take care of making the copy? Even if this took more time, we've had good overall system performance with software RAIDs rebuilding in the background before, so it could run as long as necessary. We'd then need to be able to remove the second device from the RAID and either convert it back into a plain ext2 or put it into a similar software RAID in the destination machine.
Is this possible? Is there another plan that would make more sense?
Thanks in advance for suggestions.
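For reference, the "almost three days" figure can be checked with a little shell arithmetic. This is only a sketch: the function name is made up for illustration, and it assumes 1 GB = 1024 MB.

```shell
#!/bin/sh
# minutes_to_copy SIZE_GB RATE_MB_PER_MIN
# Prints the copy time in whole minutes, rounded up (assumes 1 GB = 1024 MB).
minutes_to_copy() {
    size_mb=$(( $1 * 1024 ))
    echo $(( (size_mb + $2 - 1) / $2 ))
}

mins=$(minutes_to_copy 300 75)
echo "about $mins minutes, or $(( mins / 60 )) hours"
```

At 75MB per minute that works out to roughly 68 hours, a little under three days.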
I'm sure it's possible, and other users here can give you the exact syntax.
However, I'd suggest running rsync to the second server before attempting anything anyway, since if your mirror-add operation fails, well, do you have a backup?
rsync has a bandwidth-limiting option (--bwlimit). Run throttled, it should put far less load on your system. Subsequent runs will also be much faster, since rsync only sends the differences between the two copies.
You may find that this removes the need for the mirror-and-move method entirely: the initial run could be spread over a week or two with --bwlimit, and then a second run without --bwlimit over the already-synced path would pick up the most recent changes quickly enough that you'd only need a short maintenance window.
best, Jeff
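A minimal sketch of the two-pass approach described above. The paths and the rate are hypothetical (nothing in the thread names them), and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical paths and rate; none of these come from the thread.
SRC=/data/            # trailing slash: copy the contents of /data
DST=/mnt/newdrive/
LIMIT_KBPS=2000       # --bwlimit takes a rate in KB per second

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Pass 1: throttled, can run for days while the server stays in use.
run rsync -a --bwlimit="$LIMIT_KBPS" "$SRC" "$DST"

# Pass 2: unthrottled, during the maintenance window; only picks up
# whatever changed since pass 1.
run rsync -a "$SRC" "$DST"
```

The right value for the limit depends on the disks involved, so it is worth starting low and watching how the system responds.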
_______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Does rsync with the bandwidth limiting features still slow the system unacceptably?
Just a thought,
MrKiwi,
Also, maybe you could schedule two jobs: on Fri at 8pm, "killall rsync; rsync A B" (unthrottled over the weekend), and on Mon at 5am, "killall rsync; rsync --bwlimit=<KBPS> A B" (throttled during the week).
Would that give you fast-as-possible syncing without too much loss of responsiveness?
D
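One way to sketch the weekend/weekday idea is a small wrapper that picks the limit from the current day and hour, so both scheduled jobs can call the same script after their killall. The function name, the 2000 KB/s weekday cap, and the paths are all assumptions for illustration; the script only prints the rsync command it would run.

```shell
#!/bin/sh
# pick_bwlimit DOW HOUR
# Prints the --bwlimit value in KB/s, or 0 for "no limit".
# DOW is 1 (Mon) to 7 (Sun) as printed by `date +%u`; HOUR is 0 to 23.
# Unthrottled window: Fri 20:00 through Mon 05:00.
pick_bwlimit() {
    d=$1
    h=$2
    if [ "$d" -eq 6 ] || [ "$d" -eq 7 ]; then
        echo 0
    elif [ "$d" -eq 5 ] && [ "$h" -ge 20 ]; then
        echo 0
    elif [ "$d" -eq 1 ] && [ "$h" -lt 5 ]; then
        echo 0
    else
        echo 2000    # assumed weekday cap, tune to taste
    fi
}

dow=$(date +%u)
hour=$(date +%H)
hour=${hour#0}       # strip a leading zero so 08 and 09 are plain decimals
limit=$(pick_bwlimit "$dow" "$hour")

# Hypothetical source and destination; printed rather than executed.
if [ "$limit" -eq 0 ]; then
    echo "rsync -a /data/ /mnt/newdrive/"
else
    echo "rsync -a --bwlimit=$limit /data/ /mnt/newdrive/"
fi
```

A running rsync does not change its limit on the fly, so the killall at each boundary is still needed before restarting the wrapper.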
Quoting Bart Schaefer barton.schaefer@gmail.com:
Is it possible, without loss of data, to convert the existing ext2 filesystem into a mirrored software RAID, then add the new drive as a second device and let rebuilding the RAID take care of making the copy? Even if this took more time, we've had good overall system performance with software RAIDs rebuilding in the background before, so it could run as long as necessary. We'd then need to be able to remove the second device from the RAID and either convert it back into a plain ext2 or put it into a similar software RAID in the destination machine.
Is this possible? Is there another plan that would make more sense?
Yes, it is possible. It's not too complicated and downtime should be minimal, but if you make any calculation mistakes, you lose your data.
I can't find a document that I used to have describing the process in more detail. So, just some hints:
MD keeps its metadata at the end of the partition. Hence, the MD device will be a bit smaller than the partition (by the size of the metadata chunk). Google around or check the kernel docs to find out how much space it takes. If you can't find it, shrink the file system by 1 GB; then, after you are done creating the mirror, you can expand it back to use all available space (or simply leave it at whatever size it was).
Your filesystem currently uses the entire partition. You need to resize (shrink) it so that there's enough space for the MD metadata at the end of the partition. The file system size must be a multiple of the file system block size. Use dumpe2fs to find out the block size (and the current file system size). Use resize2fs to shrink the file system (this must be done offline). Check the man page for resize2fs for details.
If you are using LVM, you'll also need to shrink your logical and physical volumes.
Once the space at the end of the partition is not used for anything, you can use mdadm to create a RAID-1 MD device. Create it with only one disk (leave the second disk missing). After that, simply attach the second drive (again using mdadm) and let it resync. You'll probably want to change the partition type to Linux raid autodetect.
Try this on some spare testing box until you get it right. If you make any errors, you can kiss your data goodbye.
BTW, some popular HOWTOs on the web suggest resizing the file system *after* the mirror is created. This is dangerous. If there are any allocated blocks in the part of the file system at the far end of the partition (the space that will be used for MD metadata), you will lose some data or, in the worst (but not likely) scenario, you can end up losing the entire file system. Don't do that. Do it the safe way: shrink the file system first, then create the mirror.
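The steps above can be sketched roughly as follows. Everything here is an assumption to be checked on a test box first: the device names, the partition size, and the mount point are hypothetical, and the size calculation assumes the version 0.90 md superblock layout (device rounded down to a 64 KB boundary, then the last 64 KB reserved). The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# md_usable_kb PART_KB
# Prints the usable size in KB once a version 0.90 md superblock is added.
# Assumed layout: round down to a 64 KB boundary, then reserve 64 KB.
md_usable_kb() {
    echo $(( ($1 / 64) * 64 - 64 ))
}

# fs_blocks PART_KB BLOCK_BYTES
# Prints the largest file system size, in file system blocks, that still
# leaves room for the superblock. BLOCK_BYTES is 1024, 2048 or 4096.
fs_blocks() {
    echo $(( $(md_usable_kb "$1") / ($2 / 1024) ))
}

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

PART=/dev/hda1      # hypothetical: partition holding the existing ext2
NEW=/dev/hdb1       # hypothetical: same-size partition on the new drive
MNT=/data           # hypothetical mount point
PART_KB=1000001     # hypothetical; the real size is in /proc/partitions
BLOCK_BYTES=4096    # take the real value from dumpe2fs -h

BLOCKS=$(fs_blocks "$PART_KB" "$BLOCK_BYTES")

run umount "$MNT"
run e2fsck -f "$PART"                 # resize2fs wants a freshly checked fs
run resize2fs "$PART" "$BLOCKS"       # size given in file system blocks
run mdadm --create /dev/md0 --level=1 --raid-devices=2 "$PART" missing
run mount /dev/md0 "$MNT"
run mdadm /dev/md0 --add "$NEW"       # starts the background resync
run cat /proc/mdstat                  # shows resync progress
```

Shrinking by a larger margin than the calculation suggests, as the 1 GB hint above does, costs little and leaves room for error.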
On Mon, 2006-11-27 at 12:13 -0800, Bart Schaefer wrote:
We have a CentOS 3 server with about 300GB of data on an ext2 filesystem that we need to mirror onto a new drive, which we're then going to pull out and put into a second server. A straight disk-to-disk copy (with rsync, tar, or "cp -a" doesn't much matter) manages about 75MB per minute, which would take almost three days, and the system gets very sluggish while such a copy is going on, so we can't afford to just let it run.
If you can be down a few hours, you should be able to boot the install CD in rescue mode and dd bs=1M if=/dev/hda of=/dev/hdb (being very careful that those are the correct devices for the source and target respectively). I'd expect that to take about 3 hours to complete, depending on your drives and controller.
Is it possible, without loss of data, to convert the existing ext2 filesystem into a mirrored software RAID, then add the new drive as a second device and let rebuilding the RAID take care of making the copy? Even if this took more time, we've had good overall system performance with software RAIDs rebuilding in the background before, so it could run as long as necessary. We'd then need to be able to remove the second device from the RAID and either convert it back into a plain ext2 or put it into a similar software RAID in the destination machine.
Is this possible? Is there another plan that would make more sense?
It's possible, but I wouldn't try it without a backup. Another approach would be to put the drive directly in the 2nd server and do the rsync over the network. This would probably complete over a weekend. If the files are changing, you can do one run with the machine active, then repeat it when no changes are happening. The 2nd run should go very quickly since it only has to copy the differences.
On 11/27/06, Les Mikesell lesmikesell@gmail.com wrote:
If you can be down a few hours, you should be able to boot the install CD in rescue mode and dd bs=1M if=/dev/hda of=/dev/hdb (being very careful that those are the correct devices for the source and target respectively). I'd expect that to take about 3 hours to complete, depending on your drives and controller.
That may be what we have to try. This isn't the root filesystem, so we should just be able to unmount it during the dd and not have to reboot?
Another approach would be to put the drive directly in the 2nd server and do the rsync over the network. This would probably complete over a weekend.
The second server is 2900 miles away in another data center with no dedicated network between the locations, so this probably isn't going to work ...
If the files are changing, you can do one run with the machine active, then repeat it when no changes are happening. The 2nd run should go very quickly since it only has to copy the differences.
Unfortunately we're talking about a relatively small number of relatively huge files, so the checksumming is a significant portion of the rsync time.
Thanks, everyone, for your responses so far.
Hello, Bart.
You wrote on 28 November 2006, 2:04:57:
That may be what we have to try. This isn't the root filesystem, so we should just be able to unmount it during the dd and not have to reboot?
That should work fine. Unmount, copy, mount.
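The unmount, copy, mount sequence might look like the sketch below. The device names are the ones used earlier in the thread and must be verified before running anything; the mount point is hypothetical and is assumed to have an /etc/fstab entry. The helper also shows the throughput that the three-hour estimate implies. Commands are printed, not executed, unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# mb_per_sec SIZE_GB HOURS
# Prints the average rate in whole MB/s needed to copy SIZE_GB in HOURS.
mb_per_sec() {
    echo $(( $1 * 1024 / ($2 * 3600) ))
}

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

SRC_DEV=/dev/hda    # source, as named in the thread; double-check it
DST_DEV=/dev/hdb    # target, as named in the thread; double-check it
MNT=/data           # hypothetical mount point listed in /etc/fstab

echo "300GB in 3 hours needs about $(mb_per_sec 300 3) MB/s"

run umount "$MNT"
run dd bs=1M if="$SRC_DEV" of="$DST_DEV"
run mount "$MNT"
```

Swapping the if= and of= arguments would overwrite the source, which is why the devices are worth checking twice.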
On 11/27/06, Bart Schaefer barton.schaefer@gmail.com wrote:
That may be what we have to try. This isn't the root filesystem, so we should just be able to unmount it during the dd and not have to reboot?
Yes, you just have to be sure nothing changes during the copy and unmounting will take care of that.
The second server is 2900 miles away in another data center with no dedicated network between the locations, so this probably isn't going to work ...
You could use some other local machine.
Unfortunately we're talking about a relatively small number of relatively huge files, so the checksumming is a significant portion of the rsync time.
Unless you pass the -I (--ignore-times) or -c (--checksum) option, rsync will skip over existing files that match in length and timestamp without doing the block checksum comparison. Of course, if the big files are changing, this won't help much.
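As a rough illustration of that default check, the sketch below compares only size and modification time, which is the condition under which rsync skips a file. It is an approximation written for this note, not rsync's own code; the function name is made up, and it relies on GNU stat.

```shell
#!/bin/sh
# same_size_mtime A B
# Succeeds when both files have the same size and modification time.
# Uses GNU stat: %s is the size in bytes, %Y the mtime in epoch seconds.
same_size_mtime() {
    [ "$(stat -c '%s %Y' "$1")" = "$(stat -c '%s %Y' "$2")" ]
}

tmp=$(mktemp -d)
printf 'payload' > "$tmp/a"
cp -p "$tmp/a" "$tmp/b"           # -p preserves the timestamp

if same_size_mtime "$tmp/a" "$tmp/b"; then
    echo "unchanged copy: skipped without reading the contents"
fi

printf 'PAYLOAD!' > "$tmp/b"      # one byte longer than the original

if ! same_size_mtime "$tmp/a" "$tmp/b"; then
    echo "modified copy: goes through the transfer algorithm"
fi

rm -rf "$tmp"
```

For a handful of huge files that rarely change, this means a repeat run costs little; a file that did change is still read in full on both sides.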