[CentOS] Need help to fix bug in rsync

Wed Mar 25 19:17:07 UTC 2020
Leon Fauster <leonfauster at googlemail.com>

On 25.03.20 at 19:15, Simon Matter via CentOS wrote:
>> On Wed, 2020-03-25 at 14:39 +0000, Leroy Tennison wrote:
>>> Since you state that using -z is almost always a bad idea, could you
>>> provide the rationale for that?  I must be missing something.
>>>
>> I think the "rationale" is that at some point the
>> compression/decompression takes longer than the time reduction from
>> sending a compressed file.  It depends on the relative speeds of the
>> machines and the network.
>>
>> You have most to gain from compressing large files, but if they are
>> already compressed, then you have nothing to gain from just doing small
>> files.
>>
>> It obviously depends on your network speed and if you have a metered
>> connection, but does anyone really have such an ancient network
>> connection still these days - I mean if you have fast enough machines
>> at both ends to do rapid compression/decompression, it seems unlikely
>> that you will have a damp piece of string connecting them.
> 
> I really don't understand the discussion here. What is wrong with using -z
> with rsync? We're using rsync with -z for backups and just don't want to
> waste bandwidth for nothing. We have better use for our bandwidth and it
> makes quite a difference when backing up terabytes of data.
> 
> The only reason why I asked for help is because we don't want to double
> compress data which is already compressed. This is what currently is
> broken in rsync without manually specifying a skip-compress list. Fixing
> it would help all those who don't know it's broken now.
> 

Until this is fixed, a workaround would be a two-pass transfer using
filters in a ".rsync-filter" file: first run rsync -azvF for everything
that compresses well, then run rsync -av for everything, including the
already compressed data.
The ".rsync-filter" file holds the exclude rules for the compressed
formats. This only makes sense if the bandwidth saved by compression
outweighs the metadata transfer of the second run ...
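
A minimal sketch of the two passes, in case it helps. The function names
(write_filter, two_pass_sync), the suffix list and the paths in the usage
comment are my own placeholders, not anything rsync ships; adjust them to
your data. It relies on -F being shorthand for
--filter='dir-merge /.rsync-filter'.

```shell
#!/bin/sh

# Write a .rsync-filter into the source directory. Each "- PATTERN" line
# is an exclude rule; the suffix list below is illustrative only.
write_filter() {
    # $1: source directory that will hold .rsync-filter
    cat > "$1/.rsync-filter" <<'EOF'
# Already-compressed formats, skipped in the -z pass.
- *.gz
- *.bz2
- *.xz
- *.zip
- *.jpg
- *.mp4
EOF
}

two_pass_sync() {
    # $1: source directory, $2: destination (local path or host:path)
    # Pass 1: compress on the wire; -F reads .rsync-filter, so the
    # listed formats are left out of this run.
    rsync -azvF "$1/" "$2/" &&
    # Pass 2: no -z and no -F, so the excluded files are sent as-is.
    # Files copied in pass 1 are normally skipped by the size/mtime check.
    rsync -av "$1/" "$2/"
}

# Hypothetical usage:
#   write_filter /srv/data
#   two_pass_sync /srv/data backuphost:/backup/data
```

The second pass still has to exchange the file list for everything, which
is the metadata overhead mentioned above.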

--
Leon