On 21/08/2019 20:34, Stephen John Smoogen wrote:
On Wed, 21 Aug 2019 at 19:26, João Carlos Mendes Luís <jonny@corp.globo.com> wrote:
Hi,

I'm prepping a new backend for our mirror host, and just found that the CentOS mirror could use a little help from hardlinking. After running `hardlink -cvvn` on our copy of the CentOS repo, I got these results:

Directories 774
Objects 220535
IFREG 219740
Comparisons 4839
Would link 903
Would save 2951557120

This means that 903 files are exactly equal to another file (ignoring metadata such as dates and permissions), so more than 2.9GB could be saved. That is not much in a 207GB repo, but it is a saving anyway. It also means that the local file system cache would be optimized.
It might be, but it also depends on what the files are. Could you say exactly which files these are? It may be that the other data is very important for some reason and a hardlink won't be possible.
Of these 903 files, 859 are drpms, 1 is an rpm (storhaug-nfs-1.0-1.el7.noarch.rpm), 10 are RPM-GPG-KEYs, 2 are HTML (header and notes), 1 is the GPL, and the rest are some isolinux config files and many repodata files (contrib, cr, extras).
Some examples:
centos/6.10/centosplus/x86_64/drpms/kernel-firmware-2.6.32-696.30.1.el6.centos.plus_2.6.32-754.6.3.el6.centos.plus.noarch.drpm
centos/6.10/centosplus/i386/drpms/kernel-firmware-2.6.32-696.30.1.el6.centos.plus_2.6.32-754.6.3.el6.centos.plus.noarch.drpm

centos/7.6.1810/storage/x86_64/gluster-4.1/storhaug-nfs-1.0-1.el7.noarch.rpm
centos/7.6.1810/storage/x86_64/gluster-4.0/storhaug-nfs-1.0-1.el7.noarch.rpm

centos/RPM-GPG-KEY-CentOS-Testing-7
centos/7.6.1810/os/x86_64/RPM-GPG-KEY-CentOS-Testing-7

centos/6.10/os/x86_64/isolinux/boot.msg
centos/7.6.1810/os/x86_64/isolinux/boot.msg

centos/6.10/cr/x86_64/repodata/dabe2ce5481d23de1f4f52bdcfee0f9af98316c9e0de2ce8123adeefa0dd08b9-primary.xml.gz
centos/7.6.1810/cr/x86_64/repodata/dabe2ce5481d23de1f4f52bdcfee0f9af98316c9e0de2ce8123adeefa0dd08b9-primary.xml.gz
You can easily check your own repo by running `hardlink -cvvn centos`; it will NOT make any changes, it just compares files to generate the list and the report.
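For mirrors where `hardlink` is not installed, the same kind of dry-run report can be approximated with a short script. This is only a sketch under stated assumptions, not how `hardlink` itself works: the function names `file_digest` and `find_link_candidates` are made up for illustration, files are treated as identical when their size and SHA-256 digest match, and only the "Would link" and "Would save" figures are estimated. Nothing on disk is modified.

```python
import hashlib
import os
import stat
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_link_candidates(root):
    """Dry run: return (would_link, would_save_bytes) for files under root.

    Hypothetical helper; it only reads files and never creates links.
    """
    # Step 1: bucket regular, non-empty files by size (cheap pre-filter).
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_size > 0:
                by_size[st.st_size].append((path, st.st_dev, st.st_ino))

    would_link = 0
    would_save = 0
    for size, entries in by_size.items():
        if len(entries) < 2:
            continue
        # Step 2: within a size bucket, group by device and content digest.
        # Hard links cannot cross devices, so the device is part of the key.
        inodes_by_content = defaultdict(set)
        for path, dev, ino in entries:
            inodes_by_content[(dev, file_digest(path))].add(ino)
        # Step 3: every distinct inode beyond the first could be replaced
        # by a link; paths that already share an inode save nothing.
        for inodes in inodes_by_content.values():
            extra = len(inodes) - 1
            would_link += extra
            would_save += extra * size
    return would_link, would_save
```

Calling `find_link_candidates("centos")` from the directory that holds the mirror returns the two totals as a tuple. Because it hashes every file that shares a size with another file, expect it to be slow on a repo of this size.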
The problem is that every time I resync my mirror, these hardlinks are lost, so the hardlinking should be done in the master repo. Is there anything I'm not seeing that prevents this optimization?

Regards,
Jonny

------------------------------------------------------------------------
globo.com
João Carlos Mendes Luís
Senior DevOps Engineer
jonny@corp.globo.com
+55-21-2483-6893
+55-21-99218-1222

_______________________________________________
CentOS-mirror mailing list
CentOS-mirror@centos.org
https://lists.centos.org/mailman/listinfo/centos-mirror
-- Stephen J Smoogen.