[CentOS-mirror] hardlink mirror content?

Thu Aug 22 00:19:55 UTC 2019
João Carlos Mendes Luís <jonny at corp.globo.com>

On 21/08/2019 20:34, Stephen John Smoogen wrote:
>
>
> On Wed, 21 Aug 2019 at 19:26, João Carlos Mendes Luís
> <jonny at corp.globo.com> wrote:
>
>     Hi,
>
>         I'm prepping a new backend for our mirror host, and just found
>     that the CentOS mirror could benefit from hardlinking.
>     After running `hardlink -cvvn` on our copy of the CentOS repo, I got
>     these results:
>
>         Directories 774
>         Objects 220535
>         IFREG 219740
>         Comparisons 4839
>         Would link 903
>         Would save 2951557120
>
>         This means that 903 files are exact content duplicates of other
>     files (ignoring metadata such as dates and permissions), so more
>     than 2.9GB could be saved.  That is not much in a 207GB repo, but
>     it is a saving anyway.  It also means the local file system cache
>     would be better used, since hardlinked files share one cached copy.
>
>
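For reference, the arithmetic behind those figures can be sketched in Python. This is only an illustration: it assumes the "Would save" value is in bytes and that "207GB" means decimal gigabytes, neither of which the report states. The function name `summarize` is made up for this example.

```python
WOULD_SAVE_BYTES = 2951557120  # "Would save" line of the report (assumed to be bytes)
WOULD_LINK = 903               # "Would link" line of the report
REPO_SIZE_GB = 207             # repo size quoted in the message (assumed decimal GB)


def summarize(saved_bytes, linked_files, repo_gb):
    """Return (saved GB, saved GiB, share of repo, average MB per linked file)."""
    saved_gb = saved_bytes / 1e9
    saved_gib = saved_bytes / 2**30
    share = saved_gb / repo_gb
    avg_mb = saved_bytes / linked_files / 1e6
    return saved_gb, saved_gib, share, avg_mb


saved_gb, saved_gib, share, avg_mb = summarize(WOULD_SAVE_BYTES, WOULD_LINK, REPO_SIZE_GB)
print(f"{saved_gb:.2f} GB ({saved_gib:.2f} GiB), "
      f"{share:.1%} of the repo, {avg_mb:.1f} MB per linked file")
```

Under those assumptions the saving is roughly 1.4% of the repo.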
> It might be, but it also depends on what the files are. Could you list
> exactly which files these are? It may be that the other data is
> very important for some reason and a hardlink won't be possible.


     Of these 903 files, 859 are drpms, 1 is an rpm
(storhaug-nfs-1.0-1.el7.noarch.rpm), 10 are RPM-GPG-KEYs, 2 are html
(header and notes), 1 is the GPL, plus some isolinux config files and many
repodata files (contrib, cr, extras).
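
     A breakdown like the one above could be tallied with a few lines of Python. This is a rough sketch, not how the counts were actually produced: it assumes you already have the candidate paths as a plain list of strings, and the category names and the `classify`/`tally` helpers are made up for illustration.

```python
from collections import Counter
import posixpath


def classify(path):
    """Assign a mirror path to a rough category based on its name and directories."""
    name = posixpath.basename(path)
    parts = path.split("/")
    if name.endswith(".drpm"):
        return "drpm"
    if name.endswith(".rpm"):
        return "rpm"
    if name.startswith("RPM-GPG-KEY"):
        return "gpg-key"
    if "repodata" in parts:
        return "repodata"
    if "isolinux" in parts:
        return "isolinux"
    if name.endswith(".html"):
        return "html"
    return "other"


def tally(paths):
    """Count how many paths fall into each category."""
    return Counter(classify(p) for p in paths)
```

     Calling `tally(list_of_paths)` returns a `Counter` keyed by category; anything that matches none of the rules ends up under "other".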

     Some examples:

centos/6.10/centosplus/x86_64/drpms/kernel-firmware-2.6.32-696.30.1.el6.centos.plus_2.6.32-754.6.3.el6.centos.plus.noarch.drpm
centos/6.10/centosplus/i386/drpms/kernel-firmware-2.6.32-696.30.1.el6.centos.plus_2.6.32-754.6.3.el6.centos.plus.noarch.drpm

centos/7.6.1810/storage/x86_64/gluster-4.1/storhaug-nfs-1.0-1.el7.noarch.rpm
centos/7.6.1810/storage/x86_64/gluster-4.0/storhaug-nfs-1.0-1.el7.noarch.rpm

centos/RPM-GPG-KEY-CentOS-Testing-7
centos/7.6.1810/os/x86_64/RPM-GPG-KEY-CentOS-Testing-7

centos/6.10/os/x86_64/isolinux/boot.msg
centos/7.6.1810/os/x86_64/isolinux/boot.msg

centos/6.10/cr/x86_64/repodata/dabe2ce5481d23de1f4f52bdcfee0f9af98316c9e0de2ce8123adeefa0dd08b9-primary.xml.gz
centos/7.6.1810/cr/x86_64/repodata/dabe2ce5481d23de1f4f52bdcfee0f9af98316c9e0de2ce8123adeefa0dd08b9-primary.xml.gz

     You can easily check on your own repo by running `hardlink -cvvn
centos`. It will NOT make any changes; it just compares files to generate
a list and a report.
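
     If the `hardlink` tool is not at hand, the same kind of dry-run check can be approximated in Python. The sketch below is not the algorithm `hardlink` uses; it is a simple stand-in that groups files by size, then by SHA-256 of their content, ignores metadata, and changes nothing on disk. The function names are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_link_candidates(root):
    """Dry run: return (would_link, would_save_bytes, groups). Nothing is modified."""
    # Group by size first; keep one path per inode, because files that are
    # already hardlinked together would save nothing more.
    by_size = defaultdict(dict)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.stat(path)
            if st.st_size > 0:
                by_size[st.st_size].setdefault((st.st_dev, st.st_ino), path)

    would_link = 0
    would_save = 0
    groups = []
    for size, inodes in by_size.items():
        if len(inodes) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in inodes.values():
            by_digest[file_digest(path)].append(path)
        for paths in by_digest.values():
            if len(paths) > 1:
                groups.append(sorted(paths))
                would_link += len(paths) - 1
                would_save += size * (len(paths) - 1)
    return would_link, would_save, groups
```

     For example, `find_link_candidates("centos")` returns the number of files that could be replaced by links, the bytes that would be freed, and the groups of identical files. The byte count is the sum of file sizes, so it may differ slightly from a tool that counts disk blocks.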
>
>         The problem is that every time I resync my mirror, these
>     hardlinks are lost.  So the hardlinking should be done in the
>     master repo.
>
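To see whether links survived a resync, one can list the files in a tree that currently share an inode. The sketch below is only an illustration (the name `hardlink_groups` is made up here). Note that rsync's `-H` (`--hard-links`) option preserves hard links, but only links that already exist on the sending side, which is consistent with the point about the master repo.

```python
import os
import stat
from collections import defaultdict


def hardlink_groups(root):
    """Return lists of paths under root that are hardlinked to each other."""
    by_inode = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only regular files with more than one link can be part of a group.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                by_inode[(st.st_dev, st.st_ino)].append(path)
    # Links pointing outside root leave a single path here; skip those.
    return [sorted(paths) for paths in by_inode.values() if len(paths) > 1]
```

Running it before and after a sync and comparing the results shows which groups were lost.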
>         Is there anything that I'm not seeing that prevents this
>     optimization?
>
>
>     Regards,
>
>             Jonny
>
>
>     ------------------------------------------------------------------------
>     globo.com 	
>     *João Carlos Mendes Luís*
>     *Senior DevOps Engineer*
>     jonny at corp.globo.com
>     +55-21-2483-6893
>     +55-21-99218-1222
>
>
>     _______________________________________________
>     CentOS-mirror mailing list
>     CentOS-mirror at centos.org
>     https://lists.centos.org/mailman/listinfo/centos-mirror
>
>
>
> -- 
> Stephen J Smoogen.
>