[CentOS-mirror] hardlink mirror content?

Thu Aug 29 18:58:07 UTC 2019
Anssi Johansson <avij at centosproject.org>

Hi everyone. Now that we're on the subject of hard linked files, here's 
my regular reminder to all the mirror admins to use the -H flag when 
rsyncing from msync.centos.org. The -H flag preserves hard links, which 
are already extensively used for CentOS content. One way to verify that 
your hard links are OK is to run "stat 6.10/os/x86_64/images/boot.iso" 
from the top of your mirror tree. The output should show "Links: 2" (or 
more), because boot.iso is the same file as 
6.10/isos/x86_64/CentOS-6.10-x86_64-netinstall.iso. Using hard links 
makes syncs faster and saves disk space.
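
For anyone who wants to script the check above, here is a minimal
sketch. It assumes GNU stat (for the -c option); the helper names
link_count and same_file are made up for this example and are not part
of any CentOS tooling.

```shell
# link_count FILE: print the number of hard links to FILE.
link_count() {
    stat -c '%h' "$1"
}

# same_file A B: succeed if A and B are the same inode on the same device,
# i.e. they are hard links to each other.
same_file() {
    [ "$(stat -c '%d:%i' "$1")" = "$(stat -c '%d:%i' "$2")" ]
}

# Example, run from the top of a mirror tree (paths from the message above):
#   link_count 6.10/os/x86_64/images/boot.iso
#   same_file 6.10/os/x86_64/images/boot.iso \
#       6.10/isos/x86_64/CentOS-6.10-x86_64-netinstall.iso && echo "hard linked"
```

A link count of 1 for boot.iso would suggest the mirror holds a separate
copy rather than a hard link.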

If you just added -H to your rsync command line, rsync will take care of 
deleting the unneeded copies of hard linked files automatically the next 
time you sync.
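
As a rough illustration of where -H goes, here is a sketch of a sync
wrapper. The function name sync_tree, the MODULE placeholder and the
destination path are assumptions for the example; keep whatever other
options and paths your existing mirror script already uses.

```shell
# sync_tree SRC DEST: copy SRC into DEST in archive mode (-a) and
# additionally preserve hard links (-H).
sync_tree() {
    rsync -aH "$1" "$2"
}

# Hypothetical invocation; MODULE and the destination are placeholders:
#   sync_tree "msync.centos.org::MODULE/" /srv/mirror/centos/
```

Note that -a on its own does not imply -H, so the flag has to be given
explicitly.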

And now on to João's specific concerns.

We do run "hardlink" regularly on the master server, but without the -c 
flag, which "Disregards permission, ownership and other differences" 
[such as modification time]. Without -c, files that differ in those 
attributes are left as separate copies.

The repodata files need to preserve their modification times, because 
the timestamp is included in repomd.xml. If hardlink changes the 
modification time of a file mentioned in repomd.xml, it may cause odd 
problems.
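
The underlying reason is that a hard link is just a second name for the
same inode, so both names necessarily share one modification time. The
sketch below shows the effect on two throwaway files. It assumes GNU
coreutils (stat -c, touch -d), and a plain "ln -f" stands in for what a
dedup tool does when told to ignore timestamps; it is not how hardlink
itself is implemented.

```shell
# mtime FILE: print FILE's modification time as seconds since the epoch.
mtime() {
    stat -c '%Y' "$1"
}

demo=$(mktemp -d)
# Two files with identical content but different modification times.
printf 'same bytes\n' > "$demo/old.xml"
printf 'same bytes\n' > "$demo/new.xml"
touch -d '2019-01-01 00:00:00 UTC' "$demo/old.xml"
touch -d '2019-08-01 00:00:00 UTC' "$demo/new.xml"

before=$(mtime "$demo/new.xml")
# Replace new.xml with a hard link to old.xml.
ln -f "$demo/old.xml" "$demo/new.xml"
after=$(mtime "$demo/new.xml")

# new.xml now reports old.xml's timestamp, not its own original one.
echo "new.xml mtime before=$before after=$after"
rm -rf "$demo"
```

The content under the name new.xml is unchanged, but its timestamp no
longer matches whatever was recorded about it before the link was made.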

The drpms are easier in this regard and yes, it might make sense to run 
hardlink on those, because the exact timestamp is less important for 
drpms (as far as I'm aware). I don't have the authority to do that, 
however, so someone else would need to make the change.

But the drpm issue may soon be a moot point. There are plans to drop 
drpms altogether [1] and without drpms, there won't be a need to hard 
link them either.


[1] https://lists.centos.org/pipermail/centos-devel/2019-June/017433.html