An entire filesystem (~180g) needs to be copied from one local linux machine to another. Since both systems are on the same local subnet, there's no need for encryption.
I've done this sort of thing a few times before, in different ways, but wanted to get input from others on what's worked best for them.
One consideration is that the source filesystem contains quite a few hardlinks and symlinks, and of course I want to preserve these, along with all timestamps, ownerships, and permissions. Maintaining the integrity of this metadata and of the files themselves is of course the top priority.
Speed is also a consideration, but having done this before, I find it even more important to have a running progress report or log so I can see how the session is proceeding and approximately how much longer it will be until it finishes... and also to see if something's hung up.
One other consideration: There isn't much disk space left on the source machine, so creating a tar file, even compressed, isn't an option.
What relevant methods have you been impressed by?
Hi,
Rsync can maintain symlinks, hardlinks and give you a progress report as well; not to mention it can resume interruptions should they occur.
Having said that, even with your space problem, it is possible to use tar to pack the files on the fly during the transfer, which should be faster than rsync. Just search for "tar over ssh".
-- Sent from the Delta quadrant using Borg technology!
Nux! www.nux.ro
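As a rough sketch of the tar-on-the-fly idea above: the archive is written to standard output and unpacked at the other end of a pipe, so no tar file ever lands on the source disk. The function names pack_tree and unpack_tree, the host desthost, and the path /data are made up for illustration, and the options assume GNU tar as shipped with CentOS.

```shell
# pack_tree SRC_DIR: write the contents of SRC_DIR to stdout as a tar stream.
# Hard links and symlinks inside SRC_DIR are recorded as links in the stream.
pack_tree() {
    tar -C "$1" --numeric-owner -cf - .
}

# unpack_tree DEST_DIR: read a tar stream from stdin and extract it into
# DEST_DIR, keeping permissions (-p) and numeric uid/gid values.
unpack_tree() {
    tar -C "$1" --numeric-owner -xpf -
}

# Hypothetical use across two machines, run as root on the source
# (desthost and /data are placeholders, not names from this thread):
#   pack_tree /data | ssh root@desthost 'tar -C /data --numeric-owner -xpf -'
```

Unlike rsync, a broken pipe here means starting over, and there is no built-in progress display; a tool such as pv could be placed in the middle of the pipe for that, assuming it is installed.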
----- Original Message -----
From: "ken" gebser@mousecar.com
To: "CentOS mailing list" centos@centos.org
Sent: Wednesday, 17 May, 2017 17:03:13
Subject: [CentOS] Best practices for copying lots of files machine-to-machine
CentOS mailing list CentOS@centos.org https://lists.centos.org/mailman/listinfo/centos
I use rsync for such work. It is good at maintaining hard and sym links and timestamps. It can give you a running progress as well.
One thing I have learned is that crud happens and I lose my local session for some stupid reason or another, so I often run rsync in a screen session that I can easily reconnect to.
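A minimal sketch of the rsync-in-screen approach described above, not a definitive recipe. The function name copy_tree, the host desthost, the path /data, and the session name copy are placeholders. Note that -a on its own does not preserve hard links, so -H is added; -A and -X can be added as well for ACLs and extended attributes where both ends support them.

```shell
# copy_tree SRC DEST: archive-mode copy that also preserves hard links.
#   -a             recurse; keep symlinks, permissions, timestamps, owner, group
#   -H             keep hard links as hard links (not implied by -a)
#   --numeric-ids  transfer uid/gid numbers instead of mapping them by name
#   --progress     print per-file progress while the transfer runs
copy_tree() {
    rsync -aH --numeric-ids --progress "$1" "$2"
}

# Hypothetical use inside a screen session, as root on the source machine:
#   screen -S copy                            # start a named session
#   copy_tree /data/ root@desthost:/data/     # run the copy inside it
#   (detach with Ctrl-a d; reattach later with: screen -r copy)
```

If the connection drops, re-running the same command picks up where it left off, since rsync skips files that are already up to date on the destination.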
Rsync seems to be the obvious answer here.
On 17 May 2017 at 18:16, Robert Moskowitz rgm@htt-consult.com wrote:
I use rsync for such work. It is good at maintaining hard and sym links and timestamps. It can give you a running progress as well.
One thing I have learned is that crud happens and I lose my local session for some stupid reason or another, so I often run rsync in a screen session that I can easily reconnect to.
On 5/17/17, 12:03 PM, "CentOS on behalf of ken" <centos-bounces@centos.org on behalf of gebser@mousecar.com> wrote:
An entire filesystem (~180g) needs to be copied from one local linux machine to another. Since both systems are on the same local subnet, there's no need for encryption.
I've done this sort of thing before a few times in the past in different ways, but wanted to get input from others on what's worked best for them.
If shutting the machines down is feasible, I’d put the source hard drive into the destination machine and use rsync to copy it from one drive to the other (rather than using rsync to copy from one machine to the other over the network).
--- Mike VanHorn Senior Computer Systems Administrator College of Engineering and Computer Science Wright State University 265 Russ Engineering Center 937-775-5157 michael.vanhorn@wright.edu
On 05/17/2017 04:31 PM, Vanhorn, Mike wrote:
If shutting the machines down is feasible, I’d put the source hard drive into the destination machine and use rsync to copy it from one drive to the other (rather than using rsync to copy from one machine to the other over the network).
And then you are assured that all files are closed before copying.
If shutting the machines down is feasible, I’d put the source hard drive into the destination machine and use rsync to copy it from one drive to the other (rather than using rsync to copy from one machine to the other over the network).
I'm not so sure about that. The disk is probably the bottleneck, not the network. Assuming that one of the drives is a 7.2k rpm drive, it will have a sequential read/write performance of somewhere around 120 MB/s, although newer spinning drives do seem a tad faster than that. Running uncompressed rsync over SSH on a Gbit network, I have seen more than 100 MB/s; however, if your files are compressible, then you can see much better performance with compression enabled.
If your Ethernet network is 100 Mbit/s, or the source and destination drives are both SSDs, then yeah... what he said :)
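To put the figures above side by side, here is a back-of-the-envelope estimate for the ~180 GB in question. It assumes a steady transfer rate and decimal units (1 GB = 1000 MB); eta_minutes is a made-up helper, and the 12 MB/s figure for 100 Mbit/s Ethernet is a rounded assumption, not a number from this thread.

```shell
# eta_minutes SIZE_GB RATE_MB_PER_S: whole minutes needed to move SIZE_GB
# at a constant RATE_MB_PER_S (integer arithmetic, rounds down).
eta_minutes() {
    echo $(( $1 * 1000 / $2 / 60 ))
}

eta_minutes 180 120   # single 7.2k rpm disk at ~120 MB/s  -> 25
eta_minutes 180 100   # rsync over SSH on gigabit, ~100 MB/s -> 30
eta_minutes 180 12    # 100 Mbit/s Ethernet, ~12 MB/s       -> 250
```

On these assumptions the gigabit network and a single spinning disk land within a few minutes of each other, while a 100 Mbit/s link takes roughly four hours.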
Vanhorn, Mike wrote:
If shutting the machines down is feasible, I’d put the source hard drive into the destination machine and use rsync to copy it from one drive to the other (rather than using rsync to copy from one machine to the other over the network).
Why? I just rsync'd 159G in less than one workday from one server to another. Admittedly, we allegedly have a 1G network, but....
mark
Hi,
you can parallelize rsync with xargs's -P (max-procs) option (man xargs).
rsync -a -f"+ */" -f"- *" source/ server:/destination/   # sync the directory tree first
cd source/ && find . -type f -print0 | xargs -0 -P0 -I% rsync -az % server:/destination/%   # -P0 lets xargs run as many processes at a time as possible
Julius
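A variation on the parallel approach above, sketched with a cap on the number of simultaneous rsync processes rather than -P0. The function name parallel_copy and the default of 4 jobs are assumptions chosen for illustration. Because each file is sent by its own rsync process and only regular files are listed, hard-link relationships and symlinks are not carried over by this method, which matters for the original requirements.

```shell
# parallel_copy SRC DEST [JOBS]
# DEST must be an absolute local path or a remote host:/path target,
# because the second pass runs from inside SRC.
parallel_copy() {
    src=$1
    dest=$2
    jobs=${3:-4}
    # Pass 1: recreate the directory tree only (include dirs, exclude files).
    rsync -a -f'+ */' -f'- *' "$src/" "$dest/" || return 1
    # Pass 2: one rsync per regular file, at most $jobs running at a time.
    # -print0 / -0 keep file names with spaces or newlines intact.
    ( cd "$src" && find . -type f -print0 |
        xargs -0 -P "$jobs" -I% rsync -a % "$dest/%" )
}

# Hypothetical use (desthost and /data are placeholders):
#   parallel_copy /data root@desthost:/data 4
```

Whether this beats a single rsync depends on where the bottleneck is; with one spinning disk on either end, several concurrent readers may well slow things down.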
On 5/17/17, 5:27 PM, "CentOS on behalf of m.roth@5-cent.us" <centos-bounces@centos.org on behalf of m.roth@5-cent.us> wrote:
Why? I just rsync'd 159G in less than one workday from one server to another. Admittedly, we allegedly have a 1G network, but....
Well, I don’t recall ever having to rsync more than 100G (although I am doing multiple rsyncs of about 86G as we speak), and I’ve never been able to do it with machines on their own, isolated switch (so my rsyncs are competing with everything else on the network), and it’s been a while since I’ve actually tried it multiple ways and measured it, but in my experience I’ve never seen the network outperform the system bus.
--- Mike VanHorn Senior Computer Systems Administrator College of Engineering and Computer Science Wright State University 265 Russ Engineering Center 937-775-5157 michael.vanhorn@wright.edu
Vanhorn, Mike wrote:
Well, I don’t recall ever having to rsync more than 100G (although I am doing multiple rsyncs of about 86G as we speak), and I’ve never been able to do it with machines on their own, isolated switch (so my rsyncs are competing with everything else on the network), and it’s been a while since I’ve actually tried it multiple ways and measured it, but in my experience I’ve never seen the network outperform the system bus.
I wasn't saying the network outperformed the system bus. Most of the time, though, I don't have that as a possibility. Usually, all the drive bays are full and in use.
When we get to terabytes, that's at least overnight; but a few hundred gig I can do in a day, if I start early.
mark