-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 03/02/15 14:38, Karanbir Singh wrote:
Hi,
At the end of the Dojo in Brussels, I had the chance to put a question to our contributor audience: how can we get security updates out to user machines faster?
At the moment, things are set up like any other distro or large open source content network: we rsync in stages, external mirrors pick up every 4 to 6 hours, and some external mirrors pick up from other external mirrors. The net result is that, for a given update, it can take up to 16 to 18 hours before a majority of the content is synced in front of most users.
So, we have to split the answer into two categories:
* People using the default yum repositories provided by mirrorlist.centos.org: I just checked, on the node producing all those mirrorlists, how long it takes to crawl all external mirrors and validate their contents. It takes at most 3 hours to crawl all external mirrors for {5,6,7}/{os,updates,extras,centosplus}/{i386,x86_64}. So in the worst case, supposing that we dropped a new rpm/metadata just at the start of a crawler run, we'd have to wait 6 hours (that is, until the second run has finished validating and pushed new mirrorlists).
* People not using the default yum repositories: there is nothing we can do directly, as we don't control the repo they are using.
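To make the 6-hour figure concrete, here is a minimal sketch of the arithmetic. It assumes crawler runs follow each other back to back and that a run already in progress is conservatively treated as missing the new content; the function name and the offset parameter are illustrative only, not part of the actual crawler.

```python
from datetime import timedelta


def delay_until_listed(crawl_duration: timedelta,
                       offset_into_run: timedelta = timedelta(0)) -> timedelta:
    """Time until new content appears in freshly pushed mirrorlists.

    Assumptions (not taken from the real crawler): runs are back to back,
    and the run in progress when the content is dropped does not count,
    so we wait for the rest of that run plus one full run.
    """
    if not timedelta(0) <= offset_into_run < crawl_duration:
        raise ValueError("offset must fall inside a single crawler run")
    remaining = crawl_duration - offset_into_run
    return remaining + crawl_duration


# Content dropped right at the start of a 3-hour run: two full runs.
print(delay_until_listed(timedelta(hours=3)))
# Content dropped 2 hours into a run: 1 hour left plus one full run.
print(delay_until_listed(timedelta(hours=3), timedelta(hours=2)))
```

This only models the mirrorlist validation lag; the time an external mirror needs to actually sync the content comes on top of it.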
In cases like the recent glibc issue, 18 hours is a long time after the release (remember, we already lag RHEL releases, since our process starts once theirs ends).
There were a couple of ideas that came up in the conversation at the Dojo, and then in the following conversations over the entire FOSDEM weekend. The two that seemed most likely, easiest to implement and perhaps most robust both involve moving a chunk of the load to mirror.centos.org for some period of time. These are:
A) We set up a rapid update repo that would be hosted on and run from mirror.centos.org exclusively. The yum repo definitions would still point at mirrorlist, but they would only expect centos.org URLs in the baseurl stack returned by mirrorlist.centos.org. This would allow us to reduce the overall time to user visibility in default CentOS Linux installs to under an hour for content up to 250 MB in size.
B) Integrate the mirrorlist backend with the release mechanism in CentOS Linux, so that when new updates are pushed, all updates are delivered via mirror.centos.org for the next 24 hours. After this period, traffic reshapes to be delivered from the external mirrors by default.
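As a rough illustration of (B), the mirrorlist backend could switch on the time of the last push. This is only a sketch: the function name, the URL layout and the example.org mirrors are placeholders, and how the backend would actually learn about a push is left open.

```python
from datetime import datetime, timedelta
from typing import List, Optional

# Placeholder URL; only the hostname comes from the proposal.
CENTOS_ORG_MIRRORS = ["http://mirror.centos.org/"]
HOLD_WINDOW = timedelta(hours=24)  # the 24-hour period from option (B)


def build_mirrorlist(now: datetime,
                     last_push: Optional[datetime],
                     external_mirrors: List[str],
                     window: timedelta = HOLD_WINDOW) -> List[str]:
    """Serve centos.org only while a push is fresh, external mirrors otherwise."""
    if last_push is not None and now - last_push < window:
        return list(CENTOS_ORG_MIRRORS)
    return list(external_mirrors)


external = ["http://mirror-a.example.org/centos/",
            "http://mirror-b.example.org/centos/"]
push = datetime(2015, 2, 3, 0, 0)
print(build_mirrorlist(datetime(2015, 2, 3, 12, 0), push, external))
print(build_mirrorlist(datetime(2015, 2, 4, 0, 0), push, external))
```

The first call falls inside the window and returns only the centos.org entry; the second is exactly 24 hours after the push and falls back to the external mirrors.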
The key issue to note with (A) is that while we might push something to this rapid update repo, the same content will also be available in the regular updates/ repo. So once it starts showing up externally, traffic will naturally switch to using the updates repo from local mirrors (using repo names, cost, etc., we can influence repo priority where there is common content).
So, let me add something between (A) and (B): as we also control the node producing the mirrorlists, why not run a parallel job on the same host that crawls the updates repo "on demand" for a specific release when we know that we have to release "critical" updates? We'd then be able to check directly, in a loop, which mirrors already carry that specific package/repodata, and so not have to wait multiple hours. At the same time, we can add mirror.centos.org to the mix, already validated by default.
The reason I'd rather not see only mirror.centos.org nodes being used is that, from time to time, we have lost some of those nodes, either because the monthly quota had been used up or because a NOC team thought a DoS was happening :-)
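The on-demand crawl suggested above could look roughly like the sketch below. It assumes a mirror counts as validated when the checksum of its repodata/repomd.xml matches the one on the master; that criterion, the function names and the example.org URLs are assumptions for illustration, not the existing crawler's logic.

```python
import hashlib
import urllib.request
from typing import Callable, Iterable, List


def repomd_sha256(repo_url: str, timeout: float = 10.0) -> str:
    """Download repodata/repomd.xml below repo_url and return its SHA-256."""
    url = repo_url.rstrip("/") + "/repodata/repomd.xml"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return hashlib.sha256(resp.read()).hexdigest()


def validated_mirrors(mirrors: Iterable[str],
                      expected_checksum: str,
                      fetch_checksum: Callable[[str], str] = repomd_sha256,
                      always_trusted: Iterable[str] = ()) -> List[str]:
    """Return the trusted mirrors plus every mirror carrying the new metadata."""
    good = list(always_trusted)  # e.g. mirror.centos.org, validated by default
    for url in mirrors:
        try:
            if fetch_checksum(url) == expected_checksum:
                good.append(url)
        except OSError:
            # Unreachable or failing mirror (URLError is an OSError): skip it.
            continue
    return good


# Offline demonstration with a fake fetcher instead of real HTTP requests.
state = {"http://a.example.org/updates/": "new",
         "http://b.example.org/updates/": "old"}


def fake_fetch(url: str) -> str:
    if url not in state:
        raise OSError("mirror unreachable")
    return state[url]


print(validated_mirrors(list(state) + ["http://c.example.org/updates/"],
                        "new", fake_fetch,
                        always_trusted=["http://mirror.centos.org/"]))
```

Run in a loop for one release and one repo only, such a job would pick up external mirrors as soon as they have synced, rather than waiting for the full crawl.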
The second important thing to note here is traffic capacity: we can deliver up to 20 Gbit/sec in the US and EU from centos.org, and around 8 Gbit/sec for everywhere else. Given that we prefer to offload traffic to external mirrors at the moment, it's hard to estimate what the overall capacity requirements will be. Ideally, drpms should help.
Depending on how everyone feels about this, I'd like to go ahead and kick off implementation. Let's get it done before the next big security update comes through.
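For a very rough feel of what those figures mean, the back-of-the-envelope sketch below converts link capacity into full downloads per hour. It assumes each client pulls the whole 250 MB mentioned in (A), that the quoted rates are sustained and fully usable, and that there is no protocol overhead or drpm saving; none of this is measured data.

```python
def downloads_per_hour(link_gbit_per_s: float, update_mb: float) -> float:
    """Upper bound on complete downloads per hour over a saturated link.

    Uses decimal units (1 Gbit = 10**9 bits, 1 MB = 10**6 bytes).
    """
    bytes_per_second = link_gbit_per_s * 1e9 / 8
    return bytes_per_second * 3600 / (update_mb * 1e6)


print(round(downloads_per_hour(20, 250)))  # 20 Gbit/sec figure -> 36000
print(round(downloads_per_hour(8, 250)))   # 8 Gbit/sec figure  -> 14400
```

Real demand depends on how many machines run yum within that window, which is the part that is hard to estimate.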
Regards
- --
Fabian Arrotin The CentOS Project | http://www.centos.org gpg key: 56BEC54E | twitter: @arrfab