Reindl Harald wrote:
Am 19.01.2013 15:46, schrieb Nicolas Thierry-Mieg:
M. Fioretti wrote:
On Fri, Jan 18, 2013 08:07:40 AM -0500, SilverTip257 wrote:
if you really want to eliminate that data being transferred, I suppose you could do the extra work and rename the directory at the same time on the source and destination. Not ideal in the least.
Not ideal indeed, but I'll probably do it that way the next time a renaming like this happens on very large folders. I assume that after that, I'd also have to launch rsync with the option that tells it not to consider modification time.
no I don't think you will, since the file modification times won't have changed.
and even if they did - who cares?
- rsync never transfers unchanged data
- rsync will sync the destination's times from the source
- so there is nearly zero network traffic
Not true: if you change the modification time on a file, by default rsync will copy the whole file again.
See man rsync: Rsync finds files that need to be transferred using a “quick check” algorithm (by default) that looks for files that have changed in size or in last-modified time.
and yes I've tested this before posting ;-)
to avoid this you need to use --size-only .
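For anyone who wants to reproduce the quick-check behaviour without setting up two hosts, here is a rough sketch using plain GNU coreutils (the temp paths and the timestamp are made up). It shows why a bare touch is enough to make a file fail the default check: the size stays the same but the mtime no longer matches, which is exactly the criterion --size-only ignores.

```shell
#!/bin/sh
# Sketch of rsync's default "quick check": a file is flagged when its
# size OR its last-modified time differs between source and destination.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/dst"
printf 'hello\n' > "$tmp/src/file"
cp -p "$tmp/src/file" "$tmp/dst/file"     # -p preserves mtime, like rsync -t

# Change only the mtime on the source; the size is untouched
touch -d '2013-01-19 12:00:00' "$tmp/src/file"

src_size=$(stat -c %s "$tmp/src/file"); dst_size=$(stat -c %s "$tmp/dst/file")
src_mtime=$(stat -c %Y "$tmp/src/file"); dst_mtime=$(stat -c %Y "$tmp/dst/file")

if [ "$src_size" -eq "$dst_size" ]; then
    echo "sizes match: --size-only would skip the file"
fi
if [ "$src_mtime" -ne "$dst_mtime" ]; then
    echo "mtimes differ: the default quick check flags the file"
fi
```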
On 01/19/2013 10:28 AM, Nicolas Thierry-Mieg wrote:
Not true: if you change the modification time on a file, by default rsync will copy the whole file again
rsync uses an efficient algorithm to compare file contents and transfer only the differences. Reindl was correct. rsync will use very little bandwidth in this case. You can test this by rsyncing a large file from one system to another, "touch"ing the file, and then rsync again. rsync will take a little while to generate checksums of the data to determine what needs to be copied, but will not transfer the entire contents of the file.
If you run rsync with the -v flag, it will report the saved bandwidth as its "speedup". IIRC, this is expressed as the ratio of the size of files which were detected as not matching based on the given criteria (mtime and size by default, but possibly by checksum if given -c) to the size of data that was actually transmitted.
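To make that ratio concrete, here is a toy calculation with entirely made-up numbers (this reflects the reading above, not rsync's exact internal formula): if 100 MB of files fail the quick check but the delta algorithm only needs to send 200 KB over the wire, the reported speedup would be on the order of:

```shell
#!/bin/sh
# Made-up numbers, purely to illustrate the speedup ratio.
flagged_bytes=$((100 * 1024 * 1024))   # size of files that failed the quick check
sent_bytes=$((200 * 1024))             # literal data + checksums actually transmitted
speedup=$((flagged_bytes / sent_bytes))
echo "speedup is $speedup"             # prints "speedup is 512"
```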
Gordon Messmer wrote:
rsync uses an efficient algorithm to compare file contents and transfer only the differences. Reindl was correct. rsync will use very little bandwidth in this case. You can test this by rsyncing a large file from one system to another, "touch"ing the file, and then rsync again. rsync will take a little while to generate checksums of the data to determine what needs to be copied, but will not transfer the entire contents of the file.
If you run rsync with the -v flag, it will report the saved bandwidth as its "speedup". IIRC, this is expressed as the ratio of the size of files which were detected as not matching based on the given criteria (mtime and size by default, but possibly by checksum if given -c) to the size of data that was actually transmitted.
Agreed, except if both source and dest are local, e.g. backing up to a USB HD. If you test that, you'll see the speedup is 1 (i.e. no speedup).
On Sat, Jan 19, 2013 at 1:53 PM, Gordon Messmer yinyang@eburg.com wrote:
On 01/19/2013 11:31 AM, Nicolas Thierry-Mieg wrote:
agreed, except if both source and dest are local, eg back up to a USB HD. If you test that you'll see the speedup is 1 (ie no speedup).
I actually never realized that. Thanks.
I guess that makes sense - on a local file it would take longer to read the destination for comparison and then write back the merged changes than to just write the whole thing. (rsync in fact defaults to --whole-file when both source and destination are local paths.)
On 01/19/2013 11:31 AM, Nicolas Thierry-Mieg wrote:
agreed, except if both source and dest are local, eg back up to a USB HD. If you test that you'll see the speedup is 1 (ie no speedup)
That makes sense, because it would take longer to locally checksum both files and then make a difference-based copy than to just do the copy without trying to be clever about it.
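A back-of-envelope comparison makes the point. With made-up numbers for a 1 GiB file of which 1 MiB actually changed, the delta path does strictly more local I/O than a plain copy, because it still has to read both copies in full:

```shell
#!/bin/sh
gib=$((1024 * 1024 * 1024))
mib=$((1024 * 1024))
# whole-file copy: read the source once, write the destination once
whole_file_io=$((gib + gib))
# delta transfer: read source AND destination in full, then write the changes
delta_io=$((gib + gib + mib))
echo "whole-file copy: $whole_file_io bytes of I/O"
echo "delta transfer:  $delta_io bytes of I/O"
```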
On 1/19/2013 1:28 PM, Nicolas Thierry-Mieg wrote:
Not true: if you change the modification time on a file, by default rsync will copy the whole file again.
See man rsync: Rsync finds files that need to be transferred using a “quick check” algorithm (by default) that looks for files that have changed in size or in last-modified time.
and yes I've tested this before posting ;-)
to avoid this you need to use --size-only .
Yet size-only is not reliable. If, for instance, you have a simple text file containing the word hellO and someone catches the typo and changes it to hello, the file size doesn't change as near as I can tell: both show as 6 using ls -al. Unless rsync uses a more granular check of file size than I am aware of? If that's the case, someone could edit a large document, fix numerous simple typos, and wind up with the same file size.
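John's example is easy to check with throwaway files (the temp paths below are made up). The sizes come out identical, so --size-only would miss the edit, but a content checksum - which is what rsync -c compares - does change:

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'hellO\n' > "$tmp/before"      # the typo
printf 'hello\n' > "$tmp/after"       # the fix

size_before=$(wc -c < "$tmp/before")
size_after=$(wc -c < "$tmp/after")
sum_before=$(md5sum "$tmp/before" | cut -d' ' -f1)
sum_after=$(md5sum "$tmp/after" | cut -d' ' -f1)

echo "sizes: $size_before vs $size_after"      # both 6: --size-only skips it
if [ "$sum_before" != "$sum_after" ]; then
    echo "checksums differ: rsync -c would catch the edit"
fi
```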
On 01/19/2013 11:21 AM, John Hinton wrote:
Yet size only is not reliable. If for instance you have a simple text file with the word hellO and someone catches the typo and changes it to hello, the filesize doesn't change as near as I can see.
Right. -c is a better option, unless you're trying to minimize disk IO and you know that size alone is always a valid indicator of changes.
On 01/19/2013 01:21 PM, John Hinton wrote:
Yet size only is not reliable. If for instance you have a simple text file with the word hellO and someone catches the typo and changes it to hello, the filesize doesn't change as near as I can see. Both show as 6 using ls -al. Unless rsync uses a more granular check of filesize that I am not aware of? If this is the case, then someone could potentially edit a large document fixing numerous simple typos and wind up with the same filesize.
And then there is prelink, which changes the contents of files without changing either the size** or the modification time. It's the topic of some very messy code and nasty comments in my backup scripts.
** The very first time a file is prelinked, the size will change. Subsequent prelink runs may change the content, but will not affect the size.