I have a large disk full of data that I'd like to upgrade to SW RAID 1 with a minimum of downtime. Taking it offline for a day or more to rsync all the files over is a non-starter. Since I've mounted SW RAID1 drives directly with "mount -t ext3 /dev/sdX" it would seem possible to flip the process around, perhaps change the partition type with fdisk or parted, and remount as SW RAID1?
I'm not trying to move over the O/S, just a data partition with LOTS of data. So far, Google pounding has resulted in howtos like this one that's otherwise quite useful, but has a big "copy all your data over" step I'd like to skip:
http://sysadmin.compxtreme.ro/how-to-migrate-a-single-disk-linux-system-to-s...
But it would seem to me that a sequence roughly like this should work without having to recopy all the files.
1) umount /var/data
2) parted /dev/sdX (change type to fd - Linux RAID auto)
3) Set some volume parameters so it's seen as a RAID1 partition "Degraded". (parted?)
4) ??? Insert mdadm magic here ???
5) Profit! `mount /dev/md1 /var/data`
Wondering if anybody has done anything like this before...
-Ben
On Thu, Jul 24, 2014 at 7:11 PM, Lists lists@benjamindsmith.com wrote:
I have a large disk full of data that I'd like to upgrade to SW RAID 1 with a minimum of downtime. Taking it offline for a day or more to rsync all the files over is a non-starter. Since I've mounted SW RAID1 drives directly with "mount -t ext3 /dev/sdX" it would seem possible to flip the process around, perhaps change the partition type with fdisk or parted, and remount as SW RAID1?
I'm not trying to move over the O/S, just a data partition with LOTS of data. So far, Google pounding has resulted in howtos like this one that's otherwise quite useful, but has a big "copy all your data over" step I'd like to skip:
http://sysadmin.compxtreme.ro/how-to-migrate-a-single-disk-linux-system-to-s...
But it would seem to me that a sequence roughly like this should work without having to recopy all the files.
- umount /var/data;
- parted /dev/sdX (change type to fd - Linux RAID auto)
- Set some volume parameters so it's seen as a RAID1 partition "Degraded". (parted?)
- ??? Insert mdadm magic here ???
- Profit! `mount /dev/md1 /var/data`
Wondering if anybody has done anything like this before...
Even if I found the magic place to change to make the drive think it was a raid member, I don't think I would trust getting it right with my only copy of the data. Note that you don't really have to be offline for the full duration of an rsync to copy it. You can add another drive as a raid with a 'missing' member, mount it somewhere and rsync with the system live to get most of the data over. Then you can shut down all the applications that might be changing data for another rsync pass to pick up any changes - and that one should be fast. Then move the raid to the real mount point and either (safer) swap in a new disk, keeping the old one as a backup, or (more dangerous) change the partition type on the original and add it into the raid set and let the data sync up.
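For anyone who wants to follow that route, a rough sketch follows. The device names (/dev/sdY1 for the new disk, /dev/sdX1 for the old one), the temporary mount point /mnt/newdata and the rsync flags are placeholders, not anything stated in the thread:

    # Build a RAID1 on the new disk with the second member deliberately missing.
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdY1 missing

    # Fresh filesystem on the degraded array, mounted somewhere temporary.
    mkfs.ext4 /dev/md1
    mkdir -p /mnt/newdata
    mount /dev/md1 /mnt/newdata

    # One or more passes with the system live; each pass gets cheaper.
    rsync -aH --delete /var/data/ /mnt/newdata/

    # Stop the writers, do the final (fast) pass, then swap mount points.
    rsync -aH --delete /var/data/ /mnt/newdata/
    umount /var/data
    umount /mnt/newdata
    mount /dev/md1 /var/data

    # Safer: add a second brand-new disk and keep the old one as a backup.
    # Riskier: repartition the old disk (type fd) and let it resync into the array:
    #   mdadm --manage /dev/md1 --add /dev/sdX1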
On 07/24/2014 06:07 PM, Les Mikesell wrote:
On Thu, Jul 24, 2014 at 7:11 PM, Lists lists@benjamindsmith.com wrote:
I have a large disk full of data that I'd like to upgrade to SW RAID 1 with a minimum of downtime. Taking it offline for a day or more to rsync all the files over is a non-starter. Since I've mounted SW RAID1 drives directly with "mount -t ext3 /dev/sdX" it would seem possible to flip the process around, perhaps change the partition type with fdisk or parted, and remount as SW RAID1?
I'm not trying to move over the O/S, just a data partition with LOTS of data. So far, Google pounding has resulted in howtos like this one that's otherwise quite useful, but has a big "copy all your data over" step I'd like to skip:
http://sysadmin.compxtreme.ro/how-to-migrate-a-single-disk-linux-system-to-s...
But it would seem to me that a sequence roughly like this should work without having to recopy all the files.
- umount /var/data;
- parted /dev/sdX (change type to fd - Linux RAID auto)
- Set some volume parameters so it's seen as a RAID1 partition "Degraded". (parted?)
- ??? Insert mdadm magic here ???
- Profit! `mount /dev/md1 /var/data`
Wondering if anybody has done anything like this before...
Even if I found the magic place to change to make the drive think it was a raid member, I don't think I would trust getting it right with my only copy of the data. Note that you don't really have to be offline for the full duration of an rsync to copy it. You can add another drive as a raid with a 'missing' member, mount it somewhere and rsync with the system live to get most of the data over. Then you can shut down all the applications that might be changing data for another rsync pass to pick up any changes - and that one should be fast. Then move the raid to the real mount point and either (safer) swap in a new disk, keeping the old one as a backup, or (more dangerous) change the partition type on the original and add it into the raid set and let the data sync up.
I would, of course, have backups. And the machine being upgraded is one of several redundant file stores, so the risk is near zero of actual data loss even if it should not work. :)
And I've done what you suggest: rsync "online", take apps offline, rsync, swap, and bring it all back up. But the data set in question is about 100 million small files (PDFs) and even an rsync -van takes a day or more, downtime I'd like to avoid. A sibling data store is running LVM2 so the upgrade without downtime is underway, another sibling is using ZFS which breezed right through the upgrade so fast I wasn't sure it had even worked!
So... is it possible to convert an EXT4 partition to a RAID1 partition without having to copy the files over?
On 07/24/2014 10:16 PM, Lists wrote:
So... is it possible to convert an EXT4 partition to a RAID1 partition without having to copy the files over?
Unless you can figure out some way to move the start of the partition back to make room for the RAID superblock ahead of the existing filesystem, the answer is, "No." The version 1.2 superblock is located 4KB from the start of the device (partition) and is typically 1024 bytes long.
https://raid.wiki.kernel.org/index.php/RAID_superblock_formats
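For what it's worth, both offsets are visible on any existing 1.2-format member; the numbers below are only typical examples and vary with mdadm version and device size:

    mdadm --examine /dev/sdY1 | grep -E 'Version|Offset'
    #       Version : 1.2
    #   Data Offset : 262144 sectors
    #  Super Offset : 8 sectors
    # The superblock sits 8 sectors (4 KiB) into the partition, and the filesystem
    # only begins at the data offset -- which is why an ext4 filesystem that starts
    # at sector 0 of the partition can't simply be relabelled as a RAID member in place.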
On Fri, Jul 25, 2014 at 8:56 AM, Robert Nichols rnicholsNOSPAM@comcast.net wrote:
On 07/24/2014 10:16 PM, Lists wrote:
So... is it possible to convert an EXT4 partition to a RAID1 partition without having to copy the files over?
Unless you can figure out some way to move the start of the partition back to make room for the RAID superblock ahead of the existing filesystem, the answer is, "No." The version 1.2 superblock is located 4KB from the start of the device (partition) and is typically 1024 bytes long.
https://raid.wiki.kernel.org/index.php/RAID_superblock_formats
What happens if you mount the partition of a raid1 member directly instead of the md device? I've only done that read-only, but it does seem to work.
You can also try this: 1) convert your ext4 partition to btrfs; 2) make a RAID1 with btrfs. With btrfs you can convert a "bare partition" to almost any RAID level, given enough disks.
So... is it possible to convert an EXT4 partition to a RAID1 partition without having to copy the files over?
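If anyone wants to try that btrfs route, it looks roughly like this (untested here, device names are assumed, and backups are strongly advised since btrfs-convert rewrites metadata in place):

    umount /var/data
    btrfs-convert /dev/sdX1                 # in-place ext4 -> btrfs
    mount /dev/sdX1 /var/data
    btrfs device add /dev/sdY1 /var/data    # second disk joins the filesystem
    # Mirror both data and metadata across the two devices.
    btrfs balance start -dconvert=raid1 -mconvert=raid1 /var/data
    # When happy with the result, drop the rollback image btrfs-convert keeps:
    #   btrfs subvolume delete /var/data/ext2_saved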
On Fri, Jul 25, 2014 at 8:40 PM, Les Mikesell lesmikesell@gmail.com wrote:
On Fri, Jul 25, 2014 at 8:56 AM, Robert Nichols
What happens if you mount the partition of a raid1 member directly instead of the md device? I've only done that read-only, but it does seem to work.
This is the flip side of the OP's use case, i.e. you already have a RAID device and are mounting one of its members.
-- Arun Khan
On 07/25/2014 06:56 AM, Robert Nichols wrote:
Unless you can figure out some way to move the start of the partition back to make room for the RAID superblock ahead of the existing filesystem, the answer is, "No." The version 1.2 superblock is located 4KB from the start of the device (partition) and is typically 1024 bytes long.
https://raid.wiki.kernel.org/index.php/RAID_superblock_formats
Sadly, this is probably the authoritative answer I was hoping not to get. It would seem technically quite feasible to reshuffle the partition a bit to make this happen with a special tool (perhaps offline for a bit - you'd only have to manage something less than a single MB of data) but I'm guessing nobody has "felt the itch" to make such a tool.
On 07/25/2014 08:10 AM, Les Mikesell wrote:
What happens if you mount the partition of a raid1 member directly instead of the md device? I've only done that read-only, but it does seem to work.
As I originally stated, I've done this successfully many times with a command like:
mount -t ext{2,3,4} /dev/sdXY /media/temp -o rw
Recently, it seems that RHEL/CentOS is smart enough to automagically create /dev/mdX when inserting a drive "hot", e.g. USB or hot-swap SATA, so I haven't had to do this for a while. You can, however, do this, which seems to be logically equivalent:
mdadm --manage /dev/mdX --stop; mount -t ext{2,3,4} /dev/sdXY /media/temp -o rw;
-Ben
On Fri, Jul 25, 2014 at 12:32 PM, Benjamin Smith lists@benjamindsmith.com wrote:
On 07/25/2014 06:56 AM, Robert Nichols wrote:
Unless you can figure out some way to move the start of the partition back to make room for the RAID superblock ahead of the existing filesystem, the answer is, "No." The version 1.2 superblock is located 4KB from the start of the device (partition) and is typically 1024 bytes long.
https://raid.wiki.kernel.org/index.php/RAID_superblock_formats
Sadly, this is probably the authoritative answer I was hoping not to get. It would seem technically quite feasible to reshuffle the partition a bit to make this happen with a special tool (perhaps offline for a bit
- you'd only have to manage something less than a single MB of data) but
I'm guessing nobody has "felt the itch" to make such a tool.
On 07/25/2014 08:10 AM, Les Mikesell wrote:
What happens if you mount the partition of a raid1 member directly instead of the md device? I've only done that read-only, but it does seem to work.
As I originally stated, I've done this successfully many times with a command like:
mount -t ext{2,3,4} /dev/sdXY /media/temp -o rw
But if you write to it, can you clobber the raid superblock? That is, is it somehow allocated as used space in the filesystem, or is there a difference in the space available on the md and direct partition, or something else?
Is there some reason that the existing files cannot be accessed while they are being copied to the raid?
On 07/25/2014 12:12 PM, Michael Hennebry wrote:
Is there some reason that the existing files cannot be accessed while they are being copied to the raid?
Sheer volume. With something in the range of 100,000,000 small files, it takes a good day or two to rsync. This means that getting a consistent image without significant downtime is impossible. I can handle a few minutes, maybe an hour. Much more than that and I have to explore other options. (In this case, it looks like we'll be biting the bullet and switching to ZFS)
-Ben
On Fri, Jul 25, 2014 at 3:08 PM, Benjamin Smith lists@benjamindsmith.com wrote:
On 07/25/2014 12:12 PM, Michael Hennebry wrote:
Is there some reason that the existing files cannot be accessed while they are being copied to the raid?
Sheer volume. With something in the range of 100,000,000 small files, it takes a good day or two to rsync. This means that getting a consistent image without significant downtime is impossible. I can handle a few minutes, maybe an hour. Much more than that and I have to explore other options. (In this case, it looks like we'll be biting the bullet and switching to ZFS)
Rsync is really pretty good at that, especially the 3.x versions. If you've just done a live rsync (or a few so there won't be much time for changes during the last live run), the final one with the system idle shouldn't take much more time than a 'find' traversing the same tree. If you have space and time to test, I'd time the third pass or so before deciding it won't work (unless even find would take too long).
On 07/25/2014 03:06 PM, Les Mikesell wrote:
On Fri, Jul 25, 2014 at 3:08 PM, Benjamin Smith lists@benjamindsmith.com wrote:
On 07/25/2014 12:12 PM, Michael Hennebry wrote:
Is there some reason that the existing files cannot be accessed while they are being copied to the raid?
Sheer volume. With something in the range of 100,000,000 small files, it takes a good day or two to rsync. This means that getting a consistent image without significant downtime is impossible. I can handle a few minutes, maybe an hour. Much more than that and I have to explore other options. (In this case, it looks like we'll be biting the bullet and switching to ZFS)
Rsync is really pretty good at that, especially the 3.x versions. If you've just done a live rsync (or a few so there won't be much time for changes during the last live run), the final one with the system idle shouldn't take much more time than a 'find' traversing the same tree. If you have space and time to test, I'd time the third pass or so before deciding it won't work (unless even find would take too long).
Thanks for your feedback - it's advice I would have given myself just a few years ago. We have *literally* in the range of one hundred million small PDF documents. The simple command
find /path/to/data > /dev/null
takes between 1 and 2 days, system load depending. We had to give up on rsync for backups in this context a while ago - we just couldn't get a "daily" backup more often than about 2x per week. Now we're using ZFS + send/receive to get daily backup times down into the "sub 60 minutes" range, and I'm just going to bite the bullet and synchronize everything at the application level over the next week.
Was just looking for a shortcut...
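For the curious, the ZFS incrementals mentioned above amount to something like the following; the pool/dataset names and the target host are invented for the example:

    zfs snapshot tank/data@2014-07-26
    # Ship only the blocks that changed since the previous day's snapshot.
    zfs send -i tank/data@2014-07-25 tank/data@2014-07-26 | \
        ssh backuphost zfs receive -F backup/data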
On 07/25/2014 03:33 PM, Benjamin Smith wrote:
takes between 1 and 2 days, system load depending. We had to give up on rsync for backups in this context a while ago - we just couldn't get a "daily" backup more often than about 2x per week. Now we're using ZFS + send/receive to get daily backup times down into the "sub 60 minutes" range, and I'm just going to bite the bullet and synchronize everything at the application level over the next week. Was just looking for a shortcut...
Here is an evil thought. Is this possible for you to do?
1) Set up a method to obtain a RW lock for updates on the original filesystem
2) Use rsync to create a gross copy of the original (yes, it will be slightly out of phase, but stick with me for a bit) on the new filesystem on top of LVM2 on top of a RAID1 volume to make the next step much more efficient.
3) Perform the following loop (a rough shell sketch follows this list):
   a) Set the updates lock on original filesystem
   b) rsync a *subset* sub-directory of the original filesystem such that you can complete it in, at worst, only a second or two
   c) Rename the original directory to some safe alternative (safety first)...
   d) Put a symlink in place of the original directory pointing to the newly synced file system sub-directory
   e) Release the mutex lock
   f) Repeat a-e until done
4) Switch over operations to the new filesystem
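A rough shell sketch of steps 3a-3f, assuming the prior full rsync has already happened, the new filesystem is mounted at /mnt/newdata, and the application really does respect a lock file at /var/data/.update.lock (the locking mechanism is necessarily application-specific, so all of these names are assumptions):

    NEW=/mnt/newdata
    for dir in /var/data/*/; do
        name=$(basename "$dir")
        (
            flock -x 9                                # a) block writers (app-specific!)
            rsync -aH --delete "$dir" "$NEW/$name/"   # b) small delta, seconds not days
            mv "$dir" "/var/data/.$name.old"          # c) keep the original, just in case
            ln -s "$NEW/$name" "/var/data/$name"      # d) writers now land on the new fs
        ) 9>/var/data/.update.lock                    # e) lock released when subshell exits
    done                                              # f) next sub-directory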
Another approach would be to leverage something like UnionFS (see http://en.wikipedia.org/wiki/UnionFS ) to allow you to both use the filesystem *and* automatically propagate all updates to the new volume during the migration.
- Jerry Franz
On 07/26/2014 07:04 AM, Jerry Franz wrote:
On 07/25/2014 03:33 PM, Benjamin Smith wrote:
takes between 1 and 2 days, system load depending. We had to give up on rsync for backups in this context a while ago - we just couldn't get a "daily" backup more often than about 2x per week. Now we're using ZFS + send/receive to get daily backup times down into the "sub 60 minutes" range, and I'm just going to bite the bullet and synchronize everything at the application level over the next week. Was just looking for a shortcut...
Here is an evil thought. Is this possible for you to do?
Set up a method to obtain a RW lock for updates on the original filesystem
Use rsync to create a gross copy of the original (yes, it will be
slightly out of phase, but stick with me for a bit) on the new filesystem on top of LVM2 on top of a RAID1 volume to make the next step much more efficient.
- Perform the following loop:
  a) Set the updates lock on original filesystem
  b) rsync a *subset* sub-directory of the original filesystem such that you can complete it in, at worst, only a second or two
  c) Rename the original directory to some safe alternative (safety first)...
  d) Put a symlink in place of the original directory pointing to the newly synced file system sub-directory
  e) Release the mutex lock
  f) Repeat a-e until done
- Switch over operations to the new filesystem
That's essentially what we do to re-sync our production file stores. Once I move to ZFS, though, it won't be an issue.
From: Benjamin Smith lists@benjamindsmith.com
Thanks for your feedback - it's advice I would have given myself just a few years ago. We have *literally* in the range of one hundred million small PDF documents. The simple command
find /path/to/data > /dev/null
takes between 1 and 2 days, system load depending. We had to give up on rsync for backups in this context a while ago - we just couldn't get a "daily" backup more often than about 2x per week.
What about:
1. Set up inotify (no idea how it would behave with your millions of files)
2. One big rsync
3. Bring it down and copy the few modified files reported by inotify.
Or lsyncd?
JD
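A rough sketch of that idea using inotifywait from the inotify-tools package. Whether inotify copes with watches on this many directories (fs.inotify.max_user_watches would certainly need raising) is exactly the open question, and the paths here are assumptions:

    # Record everything that changes while the long initial copy runs.
    inotifywait -m -r -e modify,create,delete,move --format '%w%f' /var/data \
        > /tmp/changed.list &
    WATCH=$!

    rsync -aH /var/data/ /mnt/newdata/      # the multi-day first pass

    # Downtime window: stop the writers, then copy only the touched paths.
    kill "$WATCH"
    sort -u /tmp/changed.list | sed 's|^/var/data/||' | \
        rsync -aH --files-from=- /var/data/ /mnt/newdata/
    # (Deletions still need separate handling; --files-from only copies.)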
How about something like this: use find to process each file with a script that does something like this:

    if foo not a soft link :
        if foo open for output (lsof?) :
            add foo to todo list
        else :
            make foo read-only
            if foo open for output :
                add foo to todo list
                restore foo's permissions
            else :
                copy foo to raid
                replace original with a soft link into raid
                give copy correct permissions

Move the todo list to where it will not be written by the script. Process the todo-list files with the same script, making a new todo list. Rinse and repeat until the todo list is empty.

For the endgame, make the entire source read-only and run find again; this time there is no need for most of the tests:

    if foo not a soft link :
        copy foo to raid
        replace original with a soft link into raid
        give copy correct permissions

After the last copy, make the entire source unreadable and unwritable, wait for the last user to close its files, rename the old files' top directory, rename the raid's top directory, and let users back in.
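A minimal sketch of the per-file step, checking only "open by anyone" via lsof rather than strictly "open for output" (that would mean parsing lsof's FD column); the paths are placeholders:

    f=/var/data/some/file.pdf                    # candidate produced by find
    new=/mnt/newdata${f#/var/data}

    if [ ! -L "$f" ]; then
        if lsof -t -- "$f" >/dev/null 2>&1; then
            echo "$f" >> /tmp/todo.list          # in use; retry on a later pass
        else
            mkdir -p "$(dirname "$new")"
            cp -a -- "$f" "$new"                 # copy, preserving mode/owner/times
            ln -sf -- "$new" "$f"                # replace the original with a soft link
        fi
    fi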
rsync breaks silently or sometimes noisily on big directory/file structures. It depends on how the OP's files are distributed. We organised our files in a client/year/month/day layout and run a number of rsyncs on separate parts of the hierarchy. Older stuff doesn't need to be rsynced but gets backed up every so often.
But it depends whether or not the OP's data is arranged so that he could do something like that.
Cheers,
Cliff
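Something along these lines, if the data really is laid out client/year/month/day the way Cliff describes (layout and paths assumed):

    # Old months can be synced once and left alone; only the current month
    # needs repeated passes.
    for client in /var/data/*; do
        dest="/mnt/newdata/${client##*/}/2014/07"
        mkdir -p "$dest"
        rsync -aH "$client/2014/07/" "$dest/"
    done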
On Tue, Jul 29, 2014 at 1:25 AM, John Doe jdmls@yahoo.com wrote:
From: Benjamin Smith lists@benjamindsmith.com
Thanks for your feedback - it's advice I would have given myself just a few years ago. We have *literally* in the range of one hundred million small PDF documents. The simple command
find /path/to/data > /dev/null
takes between 1 and 2 days, system load depending. We had to give up on rsync for backups in this context a while ago - we just couldn't get a "daily" backup more often than about 2x per week.
What about:
Set up inotify (no idea how it would behave with your millions of files)
One big rsync
Bring it down and copy the few modified files reported by inotify.
Or lsyncd?
JD
On 07/28/2014 05:02 PM, Cliff Pratt wrote:
- Set up inotify (no idea how it would behave with your millions of files)
- One big rsync
- Bring it down and copy the few modified files reported by inotify.
Or lsyncd?
lsyncd is interesting, but for our use case isn't nearly as efficient as ZFS with send/receive. For one thing, lsyncd is only useful after the first rsync (which in this case takes days), so we would effectively start out with an out-of-sync system and then have to deal with millions of follow-up syncs as the monstrous number of queued-up inotify events get handled.
I'm moving ahead with the rsync-a-few-directories-at-a-time method Jerry Franz put forward, as it's fundamentally compatible with our setup. (Our "subdir" is called a "client" - we have hundreds - and we do the sync during off-hours when that client's business is closed, so nobody notices.) But it takes a week or two to fully resync a file store... in the meantime we're at N+1 redundancy instead of the usual N+2.
need to cancel subscription
On 7/29/2014 11:48 AM, Juan De Mola wrote:
need to cancel subscription
..... _______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
right there, that URL on every message.
On 07/25/2014 12:32 PM, Benjamin Smith wrote:
On 07/25/2014 06:56 AM, Robert Nichols wrote:
Unless you can figure out some way to move the start of the partition back to make room for the RAID superblock ahead of the existing filesystem, the answer is, "No." The version 1.2 superblock is located 4KB from the start of the device (partition) and is typically 1024 bytes long.
https://raid.wiki.kernel.org/index.php/RAID_superblock_formats
Sadly, this is probably the authoritative answer I was hoping not to get. It would seem technically quite feasible to reshuffle the partition a bit to make this happen with a special tool (perhaps offline for a bit
- you'd only have to manage something less than a single MB of data) but
I'm guessing nobody has "felt the itch" to make such a tool.
In thinking about this some more, I had an idea that (a) I'm not totally sure would work, and (b) strikes me as dangerous.
1. Use dmsetup to create a logical device that consists of an 8 KiB prefix followed by your existing partition with the ext4 filesystem.
2. Create your RAID1 array using the above logical device as the first member and with the second member missing.
3. Unmount the current filesystem and mount the RAID device in its place.
4. Add a new device to the (currently degraded) RAID array, and let the RAID system spend the next couple of days recovering data onto the new device.
Eventually, you would remove the dmsetup device from the RAID array and add a new device in its place.
I have a feeling you will not want to risk your data to the above procedure. ;-) Trying to reboot a system with that cobbled together RAID member might prove an interesting exercise.
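Just to spell out what step 1 might look like: emphatically untested and dangerous, and /dev/sdZ1 as the scratch device for the prefix, the 8 KiB size, and the --data-offset value are all assumptions here. mdadm may well refuse a data offset this small (bad-block log, bitmap), in which case the prefix and the offset both have to grow together.

    PART=/dev/sdX1                        # existing ext4 partition (placeholder)
    SECT=$(blockdev --getsz "$PART")      # partition size in 512-byte sectors

    # 16 sectors (8 KiB) of prefix borrowed from a scratch device, then the real
    # partition mapped linearly right after it.
    {
      echo "0 16 linear /dev/sdZ1 0"
      echo "16 $SECT linear $PART 0"
    } | dmsetup create ext4-shim

    # Degraded RAID1 on the shim.  The data offset must equal the prefix size so
    # that the md data area starts exactly where the ext4 filesystem starts.
    mdadm --create /dev/md1 --level=1 --raid-devices=2 --metadata=1.2 \
          --data-offset=8K /dev/mapper/ext4-shim missing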
On Fri, Jul 25, 2014 at 5:41 AM, Lists lists@benjamindsmith.com wrote:
I have a large disk full of data that I'd like to upgrade to SW RAID 1 with a minimum of downtime. Taking it offline for a day or more to rsync all the files over is a non-starter. Since I've mounted SW RAID1 drives directly with "mount -t ext3 /dev/sdX" it would seem possible to flip the process around, perhaps change the partition type with fdisk or parted, and remount as SW RAID1?
I'm not trying to move over the O/S, just a data partition with LOTS of data. So far, Google pounding has resulted in howtos like this one that's otherwise quite useful, but has a big "copy all your data over" step I'd like to skip:
http://sysadmin.compxtreme.ro/how-to-migrate-a-single-disk-linux-system-to-s...
For data partitions a lot of the stuff is not applicable.
With respect to the mdadm steps, creating degraded arrays, putting a filesystem on those degraded arrays and then copying over the data etc. is spot on IMO.
I would recommend the steps in the above tutorial to really be assured that none of the data is corrupted.
But it would seem to me that a sequence roughly like this should work without having to recopy all the files.
- umount /var/data;
- parted /dev/sdX (change type to fd - Linux RAID auto)
- Set some volume parameters so it's seen as a RAID1 partition "Degraded". (parted?)
- ??? Insert mdadm magic here ???
- Profit! `mount /dev/md1 /var/data`
Wondering if anybody has done anything like this before...
'mdadm' starts initializing the array (writing on the disk), overwriting your file system on that partition.
I would not recommend it, but you can try it and see what happens with your experiment. It should be a no-brainer since you have secondary backups of the data elsewhere (as stated in this thread).
-- Arun Khan