On 11/09/2013 01:07 AM, Anssi Johansson wrote:
One more thing -- if the bandwidth of the msync servers is the bottleneck, we could reduce their bandwidth requirements by serving the CR packages from public mirrors instead of mirror.centos.org. Most of the mirror.centos.org machines are also in the msync.centos.org rotation, thus they share the same bandwidth. I don't have statistics of the bandwidth consumed by CR. Someone who has access to such stats could see if this change would be worth implementing or not.
We certainly need to get smarter about the mirror/msync split, and go back to the older model we used, where msync machines that were seeding the mirror network were taken out of mirror.centos.org and we had dedicated boxes for some high capacity mirrors ( like kernel.org and heanet.ie and mirrorservice.org ).
Again, this is a great thing to try and PoC and work out a functional setup - it would, of course, need to be automated via puppet and managed mostly via zabbix. If there are manual steps involved, it's going to fail and, as has happened already, get discarded since it's just more stuff to do.
We usually run hardlink over the entire mirror tree, so that should solve a large part of the dupe-content-being-rsync'd again.
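To illustrate the idea behind the hardlink pass (this is not the hardlink tool itself, only a minimal sketch of the technique), the following Python walks a tree, groups regular files by size and SHA-256 digest, and replaces later duplicates with hard links to the first copy seen. The function names are hypothetical, and the sketch deliberately ignores ownership, permissions and timestamps, which a real run over a mirror tree would want to take into account.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Hard-link duplicate regular files under root; return how many were replaced."""
    first_seen = {}  # (device, size, digest) -> path of the first copy
    linked = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.stat(path)
            # Hard links cannot cross filesystems, so the device is part of the key.
            key = (st.st_dev, st.st_size, file_digest(path))
            first = first_seen.setdefault(key, path)
            if first == path or os.stat(first).st_ino == st.st_ino:
                continue  # first copy, or already linked to it
            tmp = path + ".hardlink-tmp"  # hypothetical temp suffix
            os.link(first, tmp)
            os.replace(tmp, path)  # atomically swap the duplicate for the link
            linked += 1
    return linked
```

Note that rsync only carries hard links across to the receiving side when it is run with -H ( --hard-links ), so the saving depends on how the downstream mirrors invoke it.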
Another issue we have, and should try to work around, is that the entire tree ( os/, isos/, repos ) is moved in at one point - maybe we should spread those out a bit, allowing more rsync runs to finish sooner ( and therefore allowing subtree rsyncs to start earlier ).
The old speedmatrix code I wrote back in 2010 mostly still works, and gives us a fair idea of what capacity the various msync and mirror.c.o machines have at a given point in time; running it a few times through a 24 hr cycle almost always gives a true picture of reality.
I can get a bunch of VMs online to trial some of these things, but they will need to be in one play-cloud or something similar. Does anyone have ideas on how one might create 'real world network problems' on a bunch of VMs ? e.g. how do we cap, say, 25% of the network interfaces to 10 Mbps, and how might we generate latency ? I have a bunch of ideas on how to do this between machines, but not so much on VMs running on the same ( or just a few ) machines.
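One possible approach, offered only as a sketch, is the Linux netem qdisc, which can add delay and cap the rate on an individual interface via tc. The Python below only builds the tc command lines and does not run them; the vnetN device names, the 80ms/20ms delay and jitter figures, and the function names are all assumptions for illustration, and the commands would need root on whichever host owns the interfaces.

```python
import random


def netem_command(dev, rate_mbit=None, delay_ms=None, jitter_ms=None):
    """Build a 'tc qdisc replace ... netem' command for one interface."""
    cmd = ["tc", "qdisc", "replace", "dev", dev, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]
        if jitter_ms is not None:
            cmd += [f"{jitter_ms}ms"]
    if rate_mbit is not None:
        cmd += ["rate", f"{rate_mbit}mbit"]
    return cmd


def plan_impairments(devs, capped_fraction=0.25, rate_mbit=10,
                     delay_ms=80, jitter_ms=20, seed=0):
    """Add latency to every interface and cap a random fraction of them."""
    rng = random.Random(seed)  # seeded so the plan is repeatable
    n_capped = round(len(devs) * capped_fraction)
    capped = set(rng.sample(list(devs), n_capped))
    return [
        netem_command(d, rate_mbit if d in capped else None, delay_ms, jitter_ms)
        for d in devs
    ]


if __name__ == "__main__":
    # Hypothetical host-side interface names for eight VMs.
    for cmd in plan_impairments([f"vnet{i}" for i in range(8)]):
        print(" ".join(cmd))
```

netem shapes traffic leaving the device it is attached to, so which direction gets slowed depends on whether the qdisc sits on the host-side interface or inside the guest.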
- KB