Hi all,
I'm writing a script that uses rsync to sync two directories on CentOS 7, and I noticed some strange behaviour.
I have two directories: src and dest. In src I generate a test file with "dd if=/dev/zero of=testfile bs=1M count=100", and when I run "du -h testfile" I get the correct result. Then I sync src/ to dest/ using "rsync -avS src/ dest/". All seems OK, but when I run "du -h dest/testfile" I get 0, while "du -b dest/testfile" gives the correct size in bytes.
I ran several tests to see what happens and noticed that the problem does not occur when I remove -S (--sparse) from the rsync command. In another test, suspecting a problem with zero-filled files, I tried generating the file using /dev/urandom then /dev/zero, and when running rsync -avS the problem disappeared.
This does not seem to be a CentOS 7 specific problem. I also tried on Fedora 31 and got the same result.
I wrote a simple bash script to reproduce the problem:
#!/bin/bash
mkdir src
mkdir dest
cd src
#dd if=/dev/urandom of=testfile bs=1M count=100
dd if=/dev/zero of=testfile bs=1M count=100
echo "src/testfile size:"
du -h testfile
du -b testfile
cd ..
rsync -avS src/ dest/ > /dev/null
echo "dest/testfile size after rsync --sparse:"
du -h dest/testfile
du -b dest/testfile
rm -f dest/testfile
echo "dest/testfile size after rsync:"
rsync -av src/ dest/ > /dev/null
du -h dest/testfile
du -b dest/testfile
Is there a bug in rsync, in du, or in something else?
Thanks in advance.
On Wed, Jan 15, 2020 at 10:18 AM Alessandro Baggi < alessandro.baggi@gmail.com> wrote:
I ran several tests to see what happens and noticed that the problem does not occur when I remove -S (--sparse) from the rsync command. In another test, suspecting a problem with zero-filled files, I tried generating the file using /dev/urandom then /dev/zero, and when running rsync -avS the problem disappeared.
https://wiki.archlinux.org/index.php/Sparse_file#Creating_sparse_files
In short, rsync is being told to create sparse files with the -S flag, so it does. Could you share what you did with the urandom then zero test you mentioned? I'm curious what exact sequence of commands you used.
On 15/01/20 17:51, Jon Pruente wrote:
In short, rsync is being told to create sparse files with the -S flag, so it does. Could you share what you did with the urandom then zero test you mentioned? I'm curious what exact sequence of commands you used.
_______________________________________________
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
Hi Jon, in my first mail I included the script with the exact order of commands that I used. Try running it as a bash script and you will see the result.
If not, my sequence is:
dd if=/dev/zero of=src/testfile bs=1M count=100
rsync -avS src/ dest/
du -h dest/testfile
du -b dest/testfile
For urandom:
dd if=/dev/urandom of=src/testfile bs=1M count=100
rsync -avS src/ dest/
du -h dest/testfile
du -b dest/testfile
Without --sparse it is the same as the first sequence, just without the -S option.
But why does du report 0M when du -b reports the correct number of bytes, and why does this happen only with a zeroed file?
I don't know if in the original post mail script
On Wed, Jan 15, 2020 at 11:38 AM Alessandro Baggi < alessandro.baggi@gmail.com> wrote:
Hi Jon, in my first mail I included the script with the exact order of commands that I used. Try running it as a bash script and you will see the result.
If not, my sequence is:
dd if=/dev/zero of=src/testfile bs=1M count=100
rsync -avS src/ dest/
du -h dest/testfile
du -b dest/testfile
For urandom:
dd if=/dev/urandom of=src/testfile bs=1M count=100
rsync -avS src/ dest/
du -h dest/testfile
du -b dest/testfile
Without --sparse it is the same as the first sequence, just without the -S option.
But why does du report 0M when du -b reports the correct number of bytes, and why does this happen only with a zeroed file?
Ah, I misunderstood what you meant. I had thought you might have created a file with urandom first and then overwrote it with zeros. This is behaving as expected with sparse files. You can create a sparse file with dd by using seek: https://www.thegeekdiary.com/how-to-create-sparse-files-in-linux-using-dd-co...
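For illustration, a minimal sketch of the dd seek approach, assuming GNU coreutils on Linux and a filesystem that supports sparse files. The file name sparse.img and the 100 MiB length are arbitrary choices for this example, not something taken from the linked article.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# count=0 copies no data; seek=100 with bs=1M puts the end of the
# output at 100 MiB, so dd only sets the file length.
dd if=/dev/zero of=sparse.img bs=1M count=0 seek=100 2>/dev/null

# File length in bytes (the figure "du -b" and "ls -l" report).
apparent_bytes=$(stat -c %s sparse.img)
# Blocks actually allocated on disk (what plain "du" is based on).
allocated_blocks=$(stat -c %b sparse.img)

echo "apparent size:    $apparent_bytes bytes"
echo "allocated blocks: $allocated_blocks"
```

On a sparse-capable filesystem the allocated block count should be zero or close to it, while the apparent size is the full 100 MiB.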
On 15/01/20 18:54, Jon Pruente wrote:
Ah, I misunderstood what you meant. I had thought you might have created a file with urandom first and then overwrote it with zeros. This is behaving as expected with sparse files. You can create a sparse file with dd by using seek: https://www.thegeekdiary.com/how-to-create-sparse-files-in-linux-using-dd-co...
Thank you for the suggestion. I thought the -S option of rsync simply meant using disk space efficiently, but this was a (big) misunderstanding. I have now read the rsync man page (and your link) again, and it means "treat sparse files efficiently to save space on disk". My question is: is rsync able to tell sparse files apart from "regular" files? If -S is given and there are no sparse files in the dataset, does it treat those files as sparse files or as "regular" files?
Why do I get different behaviour with /dev/urandom and /dev/zero? Is this a coincidence?
Thank you again for your help.
On 1/15/20 8:18 AM, Alessandro Baggi wrote:
Then I sync src/ to dest/ using "rsync -avS src/ dest/", all ok but when I run "du -h dest/testfile" I get 0 and if I run "du -b dest/testfile" I get the correct size in bytes.
That's not a bug, that's what sparse files are.
In POSIX systems, it's possible to treat a regular file like memory, and one of the things you might do with such a feature is use a file to keep track of the last time a user logged in. The simplest way to do that is to save the time value at the offset of the user's UID. My uid is 556600005, so if the file weren't sparse, that one entry would create an enormous file, but with sparse files, the system only needs to allocate one block to store that value. If a process reads that file, it will get all zeros from the OS until it reaches the date stored at my uid offset.
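To make the UID example concrete, here is a rough sketch, assuming GNU coreutils on Linux and a sparse-capable filesystem. The file name lastlogin.demo and the one-byte "record" are hypothetical simplifications for illustration only; this is not how any real login database is laid out.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

uid=556600005   # the UID quoted above

# Write a single byte at offset $uid. dd seeks past the unwritten
# region instead of filling it, so that region becomes a hole.
printf 'T' | dd of=lastlogin.demo bs=1 seek="$uid" conv=notrunc 2>/dev/null

# The file length is the offset plus the one byte written.
apparent_bytes=$(stat -c %s lastlogin.demo)
echo "apparent size: $apparent_bytes bytes"

# Compare the file length with the space really allocated.
du -h --apparent-size lastlogin.demo
du -h lastlogin.demo

# Reading from the hole returns zero bytes supplied by the OS.
first_bytes=$(head -c 16 lastlogin.demo | od -An -tx1 | tr -d ' \n')
echo "first 16 bytes: $first_bytes"
```

The two du lines should differ widely: the first reflects the file's length, the second only the block or blocks holding the byte that was written.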
Applications can't tell whether a given set of zeros in a file are actual zeros on disk, or if they're simply parts of the file that haven't been written to, so when you tell rsync to create sparse files, it will do its best to identify blocks that are all zeros and simply not write to those on the destination. Thus, if you use /dev/zero to create a file on the source and then rsync it with -S, the destination file will use zero blocks of storage. Naturally, that can only be true with files whose contents are null bytes, as you get from /dev/zero.
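The zero versus random difference can be reproduced without rsync. The sketch below assumes GNU coreutils on Linux and a sparse-capable filesystem, and uses cp --sparse=always purely as a stand-in for rsync -S: it also skips writing runs of zero bytes, but it is not a description of rsync's internals. The file names are made up for the example. It also relies on du -b being shorthand for --apparent-size --block-size=1, which is why du -b shows the full length while plain du -h shows only allocated space.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two 10 MiB source files: one all zeros, one random data.
dd if=/dev/zero    of=zero.src   bs=1M count=10 2>/dev/null
dd if=/dev/urandom of=random.src bs=1M count=10 2>/dev/null

# Copy both, asking cp to turn runs of zero bytes into holes.
cp --sparse=always zero.src   zero.copy
cp --sparse=always random.src random.copy

# The file length is unchanged by the copy.
zero_len=$(stat -c %s zero.copy)
random_len=$(stat -c %s random.copy)

# The contents are unchanged too: holes read back as zeros.
same=$(cmp -s zero.src zero.copy && echo yes || echo no)
echo "zero copy identical to source: $same"

echo "length in bytes (du -b):"
du -b zero.copy random.copy
echo "allocated space (du -h):"
du -h zero.copy random.copy
```

Only the all-zero copy should show little or no allocated space. Random data almost never contains zero runs long enough to skip, so its copy stays fully allocated, which matches the /dev/urandom result reported earlier in the thread.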
On 16/01/20 02:21, Gordon Messmer wrote:
That's not a bug, that's what sparse files are.
In POSIX systems, it's possible to treat a regular file like memory, and one of the things you might do with such a feature is use a file to keep track of the last time a user logged in. The simplest way to do that is to save the time value at the offset of the user's UID. My uid is 556600005, so if the file weren't sparse, that one entry would create an enormous file, but with sparse files, the system only needs to allocate one block to store that value. If a process reads that file, it will get all zeros from the OS until it reaches the date stored at my uid offset.
Applications can't tell whether a given set of zeros in a file are actual zeros on disk, or if they're simply parts of the file that haven't been written to, so when you tell rsync to create sparse files, it will do its best to identify blocks that are all zeros and simply not write to those on the destination. Thus, if you use /dev/zero to create a file on the source and then rsync it with -S, the destination file will use zero blocks of storage. Naturally, that can only be true with files whose contents are null bytes, as you get from /dev/zero.
Thank you for your answer. I appreciate it.