Hi,
At the end of the Dojo in Brussels, I had the chance to put a question to our contributor audience: how can we get security updates out to user machines faster?
At the moment, things are set up the way any other distro or large open source content network does it: we rsync in stages, external mirrors pick up every 4 to 6 hours, and some external mirrors pick up from other external mirrors. The net result is that for a given update, it can be up to 16 to 18 hours before we get a majority content sync in front of most users.
In cases like the recent glibc issue, 18 hrs can be a long time after the release (remember, we already lag RHEL releases, since our process starts once theirs ends).
There were a couple of ideas that came up in the conversation at the Dojo, and then in the following conversations over the FOSDEM weekend. The two that seemed most likely, easiest to implement, and perhaps most robust both involve a chunk of the load moving to mirror.centos.org for some period of time. These are:
A) we set up a rapid-update repo that would be hosted on and served from mirror.centos.org exclusively. The yum repo definitions would still point at the mirrorlist, but they would only ever get centos.org URLs back in the baseurl stack from mirrorlist.centos.org. This would let us reduce the overall time-to-user-visibility in default CentOS Linux installs to under an hour for content up to 250MB in size.
B) integrate the mirrorlist backend with the release mechanism in CentOS Linux, so that when a new update is pushed, all updates are delivered via mirror.centos.org for the next 24 hrs. After this period, traffic reshapes to be delivered from the external mirrors by default.
The key issue to note with (A) is that while we might push something to this rapid-update repo, the same content will also be available in the regular updates/ repo. So once it starts showing up externally, traffic will naturally switch to using the updates repo from local mirrors (using repo names, cost, etc., we can influence repo priority where there is common content).
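For illustration, a minimal sketch of what the client side of (A) could look like. The [centos-urgent] repo id and the repo=urgent mirrorlist parameter are hypothetical (no such endpoint exists today); the cost option is standard yum behaviour (default 1000, lower wins), which is what would let the regular updates repo take over once it carries the same packages:

#!/bin/bash
# hypothetical client-side definition for the rapid-update repo;
# the repo id and the repo=urgent parameter are invented for this sketch
cat > /etc/yum.repos.d/CentOS-Urgent.repo <<'EOF'
[centos-urgent]
name=CentOS-$releasever - Urgent updates
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=urgent
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
enabled=1
# higher cost than the default 1000, so [updates] wins whenever
# the same package is available there
cost=2000
EOF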
The second important thing to note here is traffic capacity: we can deliver up to 20 Gbit/sec in the US and EU from centos.org, and around 8 Gbit/sec everywhere else. Given that we prefer to offload traffic to external mirrors at the moment, it's hard to estimate what the overall capacity requirements will be. Ideally, deltarpms should help.
Depending on how everyone feels about this, I'd like to go ahead and kick off the implementation. Let's get it done before the next big security update comes through.
Regards
On 03/02/15 14:38, Karanbir Singh wrote:
> At the moment, things are set up the way any other distro or large
> open source content network does it: we rsync in stages [...] it can
> be up to 16 to 18 hours before we get a majority content sync in
> front of most users.
So, we have to split the answer into two categories:

* people using the default yum repositories provided by mirrorlist.centos.org: I just verified, on the node producing all those mirrorlists, the time needed to crawl all external mirrors and validate their contents. It takes at most 3 hours to crawl all external mirrors for {5,6,7}/{os,updates,extras,centosplus}/{i386,x86_64}. So in the worst-case scenario, supposing we dropped a new rpm/metadata just at the start of the crawler process, we'd have to wait 6 hours (i.e. for the second run to finish validating and pushing new mirrorlists).

* people not using the default yum repositories: nothing we can directly do, as we don't control the repos they are using.
> In cases like the recent glibc issue, 18 hrs can be a long time after
> the release [...]
> A) we set up a rapid-update repo that would be hosted on and served
> from mirror.centos.org exclusively [...]
> B) integrate the mirrorlist backend with the release mechanism in
> CentOS Linux [...] all updates are delivered via mirror.centos.org
> for the next 24 hrs [...]
So, let me add something between (A) and (B): since we also control the node producing the mirrorlists, why not have a parallel job on the same host, crawling the updates repo "on demand" for a specific release when we know we have to release "critical" updates. We'd then be able to validate, in a tight loop, which mirrors already carry that specific package/repodata, and so not have to wait multiple hours. At the same time, we can add mirror.centos.org into the mix, validated by default.

The reason why I'd not like to see only mirror.centos.org nodes being used is that from time to time we also lose some of those nodes, just because the monthly quota was used up, and/or the NOC team thought a DoS was happening :-)
> The second important thing to note here is traffic capacity: we can
> deliver up to 20 Gbit/sec in the US and EU from centos.org [...]
> Let's get it done before the next big security update comes through.
--
Fabian Arrotin
The CentOS Project | http://www.centos.org
gpg key: 56BEC54E | twitter: @arrfab
On 02/03/2015 02:03 PM, Fabian Arrotin wrote:
> On 03/02/15 14:38, Karanbir Singh wrote:
>> At the moment, things are set up the way any other distro or large
>> open source content network does it [...] it can be up to 16 to 18
>> hours before we get a majority content sync in front of most users.
> So, we have to split the answer into two categories:
> * people using the default yum repositories provided by
> mirrorlist.centos.org: [...] It takes at most 3 hours to crawl all
> external mirrors [...] So in the worst-case scenario [...] we'd have
> to wait 6 hours (i.e. for the second run to finish validating and
> pushing new mirrorlists).
This is already included in my 16-to-18-hr estimate of how long it takes to get majority sanity across the mirror network.
> * people not using the default yum repositories: nothing we can
> directly do, as we don't control the repos they are using.
People who know what they are doing will do whatever they want, and are not really impacted by changes we make to the default setup. Communication and promotion of those updates will likely be the only real value we can add for them.
>> A) we set up a rapid-update repo that would be hosted on and served
>> from mirror.centos.org exclusively [...]
>> B) integrate the mirrorlist backend with the release mechanism in
>> CentOS Linux [...]
> So, let me add something between (A) and (B): since we also control
> the node producing the mirrorlists, why not have a parallel job on
> the same host, crawling the updates repo "on demand" for a specific
> release when we know we have to release "critical" updates [...]
We can try that; it would reduce the time-to-check, but it will have no impact on the sync rates for getting content out. So at best we'd be shaving a few hours off the overall run, for a large chunk of complexity in the mirrorlist layers.

Unless we can find a huge hole with (A), it seems simplest and easily executed without any code changes.
> The reason why I'd not like to see only mirror.centos.org nodes being
> used is that from time to time we also lose some of those nodes [...]
These are largely an automation and monitoring problem, both of which can be improved. If a machine is down, our DNS should not be handing out that machine's IP for mirror.centos.org queries.
On 03/02/15 13:38, Karanbir Singh wrote:
> The second important thing to note here is traffic capacity: we can
> deliver up to 20 Gbit/sec in the US and EU from centos.org, and
> around 8 Gbit/sec everywhere else. [...]
Worth keeping in mind that we have this capacity when not also in 'new release mode'; when there is a snapshot or release going through, capacity would be drastically reduced.
How about a variation on A:
- ask some of the main mirrors for push access, put those in order, and make a new [centos-security] repo with a mirrorlist pointing just at them.
Lucian
--
Sent from the Delta quadrant using Borg technology!
Nux! www.nux.ro
----- Original Message -----
From: "Karanbir Singh" mail-lists@karan.org
To: centos-devel@centos.org
Sent: Tuesday, 3 February, 2015 13:38:28
Subject: [CentOS-devel] setting up an emergency update route
[...]
On 03/02/15 15:56, Nux! wrote:
> How about a variation on A:
> - ask some of the main mirrors for push access, put those in order,
> and make a new [centos-security] repo with a mirrorlist pointing
> just at them.
> Lucian
Well, asking for "push access" can be problematic for various reasons (and I'd fully understand why they'd all answer "no"). What I was thinking about (and discussed at FOSDEM with some other distributions' infra team members) was something light (like MQTT) that can be used as a simple message queue: we'd publish, and all external mirrors could then just subscribe and take action.
Without going that far: a simple file that we'd drop at a specific place on the msync node could be checked/parsed by external mirrors from a frequent cron job running on those nodes (we can even provide a simple bash example, so there's nothing really to install as a requirement for those 580+ external mirrors). A rough sketch follows.
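Something along these lines, with an invented trigger URL and local paths (the real file location and layout would need agreeing on):

#!/bin/bash
# sketch: compare a small trigger file on the master with the copy we
# saw last time, and only run the expensive rsync when it changed
TRIGGER_URL="http://msync.centos.org/centos/TIME"   # assumed location
STATE=/var/lib/centos-mirror/last-trigger
MIRROR_ROOT=/srv/mirror/centos

mkdir -p "$(dirname "$STATE")"
new=$(curl -fsS "$TRIGGER_URL") || exit 0   # master unreachable: retry next run
old=$(cat "$STATE" 2>/dev/null)

if [ "$new" != "$old" ]; then
    rsync -aqH --delete rsync://msync.centos.org/CentOS/ "$MIRROR_ROOT/" \
        && printf '%s\n' "$new" > "$STATE"
fi

Dropped into /etc/cron.d with a */10 schedule, that gives an external mirror a roughly 10-minute reaction time at the cost of one tiny HTTP request per run.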
The interesting thing is that we can also mix all the current proposals together: an "emergency updates" repo using only centos.org nodes, a different crawler process running faster, dedicated to a specific release/updates repo and looping continuously, and that external trigger running on the external mirrors.
--
Fabian Arrotin
The CentOS Project | http://www.centos.org
gpg key: 56BEC54E | twitter: @arrfab
On 3 February 2015 at 07:56, Nux! nux@li.nux.ro wrote:
> How about a variation on A:
> - ask some of the main mirrors for push access, put those in order,
> and make a new [centos-security] repo with a mirrorlist pointing
> just at them.
> Lucian
Speaking from working on the Red Hat/Fedora side of mirroring for a while... mirrors do not like to be pushed to. They also do not like having to run software that would initiate a pull when notified of new content. For most of them, this is a spare-time, feel-good item. They have no time, and usually have to spend a good portion of the year explaining to their bosses why they even have a mirror on the internet, since that costs someone a lot of money somewhere. So anything which adds to that workload is a minus.
On 3 February 2015 20:59:39 EET, Stephen John Smoogen smooge@gmail.com wrote:
> On 3 February 2015 at 07:56, Nux! nux@li.nux.ro wrote:
>> How about a variation on A: [...]
> Speaking from working on the Red Hat/Fedora side of mirroring for a
> while... mirrors do not like to be pushed to. They also do not like
> having to run software that would initiate a pull when notified of
> new content. [...] So anything which adds to that workload is a minus.
With my mirror admin hat on: even if we do not accept pushes/notifications, we'll gladly poll any designated upstream mirror at any frequency you consider suitable. And we'll also host any additional repo (if needed) of any (decent) size. The Debian mirror is already larger. Much larger :)
+1 what wolfy said
--
Sent from the Delta quadrant using Borg technology!
Nux! www.nux.ro
----- Original Message -----
From: "Manuel Wolfshant" wolfy@nobugconsulting.ro
To: "The CentOS developers mailing list." centos-devel@centos.org
Sent: Tuesday, 3 February, 2015 19:08:39
Subject: Re: [CentOS-devel] setting up an emergency update route
[...]
On 02/03/2015 07:08 PM, Manuel Wolfshant wrote:
> On 3 February 2015 20:59:39 EET, Stephen John Smoogen wrote:
>> Speaking from working on the Red Hat/Fedora side of mirroring for a
>> while... mirrors do not like to be pushed to. [...]
> With my mirror admin hat on: even if we do not accept
> pushes/notifications, we'll gladly poll any designated upstream
> mirror at any frequency you consider suitable. [...]
Repeated polling is counterproductive: for the 6 times the high-prio push was needed in the last year, it's a waste to destroy mirror caches every 10 min through the entire year.

Having dedicated nodes just to push rsync targets is also bad, since those machines then don't deliver any user-facing service (or bandwidth) for most of the time.
On Tue, Feb 3, 2015 at 12:58 PM, Karanbir Singh mail-lists@karan.org wrote:
> Repeated polling is counterproductive: for the 6 times the high-prio
> push was needed in the last year, it's a waste to destroy mirror
> caches every 10 min through the entire year.
What cache are you referring to specifically (filesystem? reverse proxy cache? other?)?
Obviously the rsync method where each mirror pretty much "does their own thing" is dated and not optimal. The "hi, I just updated my mirror, here's what I have currently" script portion of MirrorManager can at least help on the polling side so that you have a more accurate and timely idea of which mirrors are up to date. Leveraging that, or similar, may be a small change that could help move things in the right direction (and may or may not be part of a long-term way to improve distro mirroring).
For starters, why not select a core group (10-20? Just making up a number here, but get a good geographic/network spread) of external "tier 1" mirrors and ask them to update more frequently (one hour seems reasonable to me, and as an ex-mirror-admin I don't think that is asking too much). And scan those more frequently (or use something similar to the MirrorManager "I just updated" script) so that the status of those mirrors is well known and they can be easily flagged if they are not being updated.
Non "tier 1" mirrors are asked to pull from the tier 1 mirrors, and are asked to update at least every X hours. I'm making the assumption that one hour may be too frequent for some mirror admins, but perhaps push them into updating at least every 2 or 3 hours. These mirrors could be scanned for status less frequently than the tier 1 mirrors because you know they will be at least 2 hours behind or so.
Any other mirrors (not tier 1 or tier 2) are either dropped completely from the official mirror list, or are kept on a separate "we don't endorse these, but here are some mirrors that may be fast for you to use, although perhaps slightly out of date" list.
I think just that bit of shrinking the update window for mirrors could make quite a difference.
I would argue that people who demand a faster update window than 3-4 hours should look at a paid, supported alternative. That said, I don't want to use that as an argument against making the updates process as fast as we possibly can.
-Jeff
On Wed, Feb 4, 2015 at 9:01 AM, Jeff Sheltren jeff@tag1consulting.com wrote:
> Obviously the rsync method where each mirror pretty much "does their
> own thing" is dated and not optimal.
I've always thought that the default mirrorlist scheme in the yum configuration wildly increases the load on mirror sites because when sites update multiple machines, any local http cache they go through will pull a copy from every mirror in the list instead of re-using the first copy.
But, I'm even more surprised that someone hasn't glued a bittorrent-like transport into yum to automatically chunk stuff up and pull from multiple places at once with the mirror caching taking care of itself.
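On the shared-cache problem Les raises, the usual workaround is to pin a single baseurl on machines behind a shared proxy, so the cache sees one URL per package and can reuse its copy. A sketch, with placeholder mirror and proxy hosts (not real endpoints):

#!/bin/bash
# disable the stock mirrorlist-driven updates repo and pin one mirror,
# so a shared squid instance can actually reuse cached packages;
# mirror.example.org and squid.example.org are placeholders
yum -y install yum-utils
yum-config-manager --disable updates
cat > /etc/yum.repos.d/CentOS-Updates-Pinned.repo <<'EOF'
[updates-pinned]
name=CentOS-$releasever - Updates (pinned behind a local proxy)
baseurl=http://mirror.example.org/centos/$releasever/updates/$basearch/
gpgcheck=1
enabled=1
proxy=http://squid.example.org:3128
EOF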
On 02/04/2015 03:01 PM, Jeff Sheltren wrote:
> On Tue, Feb 3, 2015 at 12:58 PM, Karanbir Singh mail-lists@karan.org wrote:
>> Repeated polling is counterproductive [...]
> What cache are you referring to specifically (filesystem? reverse
> proxy cache? other?)?
Filesystem caches: getting them up and keeping them warm has a massive impact on deliverability of content from the mirror nodes. A very large number of machines still run off 1 or 2 HDDs, typically in a RAID1, but they can easily deliver more than a couple of hundred megs of data. A complete rsync over stale content kills that.

John Hawley's paper on mirrors and filesystem caches is largely still relevant (~7 years down the road from when it was written?).

The main issue is that while there are only a few updates, the rsync will trawl the entire tree, including components that are potentially 3 or 4 updates behind, for a 100GB on-disk payload.
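One way to keep that stat pass small, sketched under the assumption of the usual msync rsync module layout: sync only the updates/ repos a mirror actually serves, rather than walking the whole tree.

#!/bin/bash
# sketch: walk only the updates/ repos instead of the full ~100GB tree
UPSTREAM=rsync://msync.centos.org/CentOS   # assumed module path
DEST=/srv/mirror/centos

for rel in 5 6 7; do
    for arch in i386 x86_64; do
        # not every release/arch combination exists; ignore failures
        rsync -aqH --delete \
            "$UPSTREAM/$rel/updates/$arch/" \
            "$DEST/$rel/updates/$arch/" || true
    done
done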
> The "hi, I just updated my mirror, here's what I have currently"
> script portion of MirrorManager can at least help on the polling side
> so that you have a more accurate and timely idea of which mirrors are
> up to date. [...]
We tried this; people lied. Not everyone runs a complete mirror, and having this run client-side dramatically increases the chances of a dirty mirror being accepted in. If we validate mirrors, it really must happen from an external source. Maybe publishing a checksum, or some metadata that is used as a component of the overall yes/no, might work.
> For starters, why not select a core group (10-20? Just making up a
> number here, but get a good geographic/network spread) of external
> "tier 1" mirrors and ask them to update more frequently [...]
At this point, why not just deploy a distributed gluster setup and ask them to join as replicas?
Non "tier 1" mirrors are asked to pull from the tier 1 mirrors, and are asked to update at least every X hours. I'm making the assumption that one hour may be too frequent for some mirror admins, but perhaps push them into updating at least every 2 or 3 hours. These mirrors could be scanned for status less frequently than the tier 1 mirrors because you know they will be at least 2 hours behind or so.
Any other mirrors (not tier 1 or tier 2) are either dropped completely from the official mirror list or are kept on a separate "we don't endorse these, but here are some mirrors that may be fast for you to use, although perhaps slightly out of date).
I think just that bit of shrinking the update window for mirrors could make quite a difference.
I would argue that people who demand a faster update window than 3-4 hours should look at a paid, supported alternative. That said, I don't want to use that as an argument against making the updates process as fast as we possibly can.
What you say here makes sense, but it works on the assumption that rsync over massive trees is going to work; it doesn't. I think for increasingly large sets of data, rsync as a mechanism to send down cascading trees is just broken. We end up with large numbers of machines serving no real user-facing content, just iterating over the same content and comparing it with the states of remote machines.

Of course, all this is orthogonal to the 'urgent updates' repo. We need to find a better way to get content out for the entire trees, but do we need to have that in place before we do this 'urgent updates' repo? Can we not just have that run from mirror.centos.org (which has a 10 min update delta), while we work out what the larger solution might be?
On 02/05/2015 11:32 AM, Karanbir Singh wrote:
> Of course, all this is orthogonal to the 'urgent updates' repo. We
> need to find a better way to get content out for the entire trees,
> but do we need to have that in place before we do this 'urgent
> updates' repo? Can we not just have that run from mirror.centos.org
> (which has a 10 min update delta), while we work out what the larger
> solution might be?
How about:
- set up (server side) an additional "centos-urgent-packages-that-should-have-been-installed-yesterday" repo
- set MirrorManager to provide mirror.c.o exclusively as the source for those packages
- push an updated centos-release with the above repo enabled by default
- hardlink the packages into centos-updates, from where they will eventually be mirrored by all the 2nd and 3rd tier mirrors (see the sketch below)

Eventually, once "should have been installed yesterday" turns into "last week's content", one can delete the content of this repo, since the content will have been available for a long time via the regular updates channel.
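Server side, the hardlink step could be as small as this sketch; the paths and the urgent repo layout are invented, and createrepo --update is the only real requirement:

#!/bin/bash
# sketch: store an urgent package once on disk, visible in two repos
URGENT=/repo/centos/7/urgent/x86_64
UPDATES=/repo/centos/7/updates/x86_64

# hardlink (same filesystem), so the payload is stored exactly once
ln "$URGENT"/Packages/*.rpm "$UPDATES/Packages/" 2>/dev/null || true

# refresh the metadata of both repos
createrepo --update "$URGENT"
createrepo --update "$UPDATES"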
On 02/05/2015 10:37 AM, Manuel Wolfshant wrote:
> On 02/05/2015 11:32 AM, Karanbir Singh wrote:
>> Can we not just have that run from mirror.centos.org (which has a 10
>> min update delta), while we work out what the larger solution might be?
> How about:
> - set up (server side) an additional
> "centos-urgent-packages-that-should-have-been-installed-yesterday" repo
> - set MirrorManager to provide mirror.c.o exclusively as the source
> for those packages
> - push an updated centos-release with the above repo enabled by default
> - hardlink the packages into centos-updates, from where they will
> eventually be mirrored by all the 2nd and 3rd tier mirrors
This is pretty much scenario A from my original email: we use mirror.c.o to deliver the content real quick, and if the updates repo has higher precedence, then when the same rpms show up in there (8+ hrs later, or whenever) people will get them from there instead.
On 02/05/2015 10:37 AM, Manuel Wolfshant wrote:
> Eventually, once "should have been installed yesterday" turns into
> "last week's content", one can delete the content of this repo [...]
Not sure about this deleting thing; it will need a lot more thought.

Also, folks running priorities will need to scope out how this is going to impact their setup (and their ability to get something that updates base from a repo that isn't called updates).
On Thu, Feb 5, 2015 at 1:32 AM, Karanbir Singh mail-lists@karan.org wrote:
> On 02/04/2015 03:01 PM, Jeff Sheltren wrote:
> Of course, all this is orthogonal to the 'urgent updates' repo. [...]
> Can we not just have that run from mirror.centos.org (which has a 10
> min update delta), while we work out what the larger solution might be?
Yes, sorry for taking this off topic. I actually talked to John about this yesterday, and I think we're both in agreement with your initial proposal in this thread.
-Jeff
On Thu, Feb 05, 2015 at 09:32:35AM +0000, Karanbir Singh wrote:
>> Obviously the rsync method where each mirror pretty much "does their
>> own thing" is dated and not optimal. The "hi, I just updated my
>> mirror, here's what I have currently" script portion of MirrorManager
>> can at least help on the polling side [...]
> We tried this; people lied. [...] If we validate mirrors, it really
> must happen from an external source. [...]
For the record, this script's primary use-case is private mirrors: those within a firewall or network that we cannot reach, so we have no way of knowing whether they are up to date. The only option is to have them tell us that they are up to date, and that is done by running this script.
I believe all the publicly accessible mirrors are monitored by our crawler.
Pierre
Hi,
I've been thinking a bit about this. The best solution IMHO, besides building our own CDN (which is indeed a bit over the top for this), is to push these updates instead of working with a pull method. So would it be possible to find some mirrors that would allow us to push packages into our repos on their servers? For releases that need to go out quickly, we could use a separate mirrorlist that only includes our servers and the mirrors that allow us to push. That way we can move the needed packages out quickly and let users get them fast. Later, as the other mirrors sync up, we just go back to the normal mirrorlist.
Stupid idea or not?
Kind regards, Tim
On 7 February 2015 at 08:12, Tim Verhoeven tim.verhoeven.be@gmail.com wrote:
> I've been thinking a bit about this. The best solution IMHO, besides
> building our own CDN [...], is to push these updates instead of
> working with a pull method. [...]
> Stupid idea or not?
I don't think it is "stupid", but it is overly simplified. Just going off the EPEL check-ins to mirrorlist, there are at least 400k-600k active systems that are going to be checking hourly for an emergency update. The number of mirrors that allow a push system is going to have to be large enough to deal with the thundering-herd problem when an update occurs and 500k systems check in at 10 after the hour (seemingly a common time for boxes which check in hourly), all see there is a new update, and start pulling from it.

In the many years of mirror administration, there have been multiple requests for some sort of push system to allow for speedier downloads. Out of the thousands of mirrors, the number who say they will do it is usually less than 10. And none of them are the guys with very large bandwidth.

Take problem A, add it to problem B, and you end up with a recipe for complete meltdown of a service you are hoping will help people.

Problem A isn't something that anyone can fix. The hundreds of thousands to millions of systems out there that look for updates regularly aren't something you can administer. You can give them premade crontabs, etc., and you will find that 10%-15% of the people who were checking in at 10 after the hour now do it spread around the hour, but you still have a huge lump at 10 after the hour. [Mainly because sysadmins like to use the script they know has worked for the last 10+ years versus some god-knows-who-tested-it script.]

Problem B is one that could possibly be dealt with, but it is not just a matter of convincing the mirror administrators: their management must also accept the risk in security, network bandwidth costs, and other factors. That takes a lot of social capital, marketing, and general sales skills. If you have them, then you have a better chance of accomplishing it than most system administrators.
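To illustrate the premade-crontab point: a jittered entry like the sketch below spreads those check-ins across the hour instead of piling them up at :10. Illustrative only, not an official CentOS crontab:

#!/bin/bash
# note the escaped %: cron treats an unescaped % as a newline
cat > /etc/cron.d/centos-update-check <<'EOF'
# m h dom mon dow user command
0 * * * * root sleep $((RANDOM \% 3600)) && yum -q -y update
EOF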
On Sat, Feb 7, 2015 at 3:44 PM, Stephen John Smoogen smooge@gmail.com wrote:
> On 7 February 2015 at 08:12, Tim Verhoeven tim.verhoeven.be@gmail.com wrote:
>> I've been thinking a bit about this. [...] Stupid idea or not?
> I don't think it is "stupid", but it is overly simplified. [...] The
> number of mirrors that allow a push system is going to have to be
> large enough to deal with the thundering-herd problem [...]
There are approaches that could make it more effective. One of them is an inventory-based update mechanism: a server-side flag, available to clients, that reports changes in the repository and lets clients update efficiently by scanning it for new files and repodata could be far more efficient for many sites.

One of the subtler difficulties, and this is being ignored by the Fedora migrations to dnf, is the cost of the metadata updates. The repodata alone is over 500 MBytes for CentOS 7. This is *insane* to keep transmitting for every micro-update or critical update. Scaled out across a bulky local cluster, simply running "yum check-update" can saturate your bandwidth, and it has done so for me. That's why I use local mirrors when possible. But then, hey, my local mirror has to pull these alerts *all the time*, which puts it in a constant state of churn for the repository information. It gets out of hand very quickly.

The underlying solution to the bulky repodata is to *stop using monolithic repodata*. Switch to much, much lighter-weight repodata, stop trying to invent new, bulky, confusing features such as "Recommends", and concentrate on splitting it up much like "apt" splits up its repositories. One package, one small header file; when the package updates, update *that* header file instead of a monolithic database.

I realize that's not going to happen right now: too much work has been invested in yum and dnf as they exist to do this. But it's worth keeping in mind that it sets a half-gig transmission cost on *any* repository updates of the main OS repositories.
On Sun, Feb 8, 2015 at 12:49 AM, Nico Kadel-Garcia nkadel@gmail.com wrote:
> I realize that's not going to happen right now: too much work has
> been invested in yum and dnf as they exist to do this. But it's worth
> keeping in mind that it sets a half-gig transmission cost on *any*
> repository updates of the main OS repositories.
Rechecked my numbers: it's less than 50 MB, and that's what I get for using a different tool to check a repo. It's still a significant noise floor with frequent churning updates, but nowhere near as bad.
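A quick way to sanity-check that number: sum the compressed sizes advertised in repomd.xml for one repo. The mirror URL is just an example, and this assumes createrepo-style metadata, which carries a <size> element per file:

#!/bin/bash
# sum the repodata sizes listed in repomd.xml for a single repo
REPO=http://mirror.centos.org/centos/7/os/x86_64
curl -fsS "$REPO/repodata/repomd.xml" \
  | grep -oP '(?<=<size>)[0-9]+' \
  | awk '{s+=$1} END {printf "%.1f MiB of repodata\n", s/1024/1024}'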
On 07/02/15 15:12, Tim Verhoeven wrote:
> I've been thinking a bit about this. The best solution IMHO, besides
> building our own CDN [...], is to push these updates instead of
> working with a pull method. [...] Later, as the other mirrors sync
> up, we just go back to the normal mirrorlist.
This point was argued a bit earlier as well; it has legs, and I think we might be able to make it work. What we will however need, just off the top of my head:
1) a way for mirrors to sign up to this
2) a way for us to deliver the requirements (a sanity-check script, an rsync target, an ssh public key that we'd hold the private key for, etc.); see the sketch after this list
3) a way to make sure we are sanity-testing this process every few hours/days/weeks, so that when it's needed, it works
4) a way to feed back that 'X mirror' now has 'content delivered', and get that metadata into the mirrorlist hosts on IPv4 and IPv6

(1) might just be a case of having people sign up via bugs.centos.org, with (2) delivered as a wiki article, plus an automated script that can test everything for the user before the mirror is added to the checker/delivery/notify scripts. (3) and (4) are going to need a bit more work.

Also, we are going to need some specifics that mirrors must meet to be included, e.g. geographic spread, capacity to host, management, etc.
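For (2), the project-side delivery step could be as simple as this sketch; the mirrors list, key path, and per-mirror targets are all placeholders:

#!/bin/bash
# sketch: push the urgent repo to each signed-up mirror over rsync+ssh
KEY=/etc/centos-push/id_push            # we hold the matching private key
SRC=/repo/centos/7/urgent/

# mirrors.list lines look like: "hostname /path/to/rsync/target"
while read -r host target; do
    rsync -aqH --delete -e "ssh -i $KEY -o BatchMode=yes" \
        "$SRC" "push@$host:$target" \
        || echo "push to $host failed" >&2
done < /etc/centos-push/mirrors.list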
On Feb 03 8:58pm, Karanbir Singh wrote:
> Repeated polling is counterproductive: for the 6 times the high-prio
> push was needed in the last year, it's a waste to destroy mirror
> caches every 10 min through the entire year.
> Having dedicated nodes just to push rsync targets is also bad, since
> those machines then don't deliver any user-facing service (or
> bandwidth) for most of the time.
Since the collection of mirror hosts is really just a large distributed system, it would be prudent to think about it in that context and not worry (at this point) about such minor implementation-specific details.
The overview (10,000 ft view) becomes simply the message layer and the transport layer. Rsync is perfectly sufficient for the transport layer. The problem being discussed, however, is mostly relevant to the message layer. That layer is simply "when is there new stuff to grab?". The problem is muddled by the fact that rsync is being used as a part of the message layer, too, and that is not optimal. Rsync should be able to say:
"I am grabbing that which is different"
Instead of saying:
"If there is something different, I will grab it"
The second sentence is primarily a question of when, not a question of what. Rsync is a very expensive way of trying to ask when. What is needed is a better (not time-based) method of triggering rsync. A simple timestamp check of a file grabbed via curl, while not exactly robust, would suffice as a trigger. A high rate of polling for such a tiny thing would be low-cost, and logic based on that poll would determine whether rsync is triggered. Other options, like a RabbitMQ-based queue, could be very robust, in that it could coordinate the external rsync processes to manage a thundering herd and lessen the chance of an inadvertent DDoS.
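As a sketch of that trigger, with an invented flag URL and paths: a HEAD request moves a few hundred bytes per poll, and a changed Last-Modified header becomes the cue to fire rsync.

#!/bin/bash
FLAG_URL="http://msync.centos.org/centos/TIME"   # assumed flag file
STATE=/var/lib/centos-mirror/last-modified

# HEAD request only; compare the Last-Modified header with the last run
new=$(curl -fsSI "$FLAG_URL" | tr -d '\r' | sed -n 's/^Last-Modified: //p')
[ -n "$new" ] || exit 0
old=$(cat "$STATE" 2>/dev/null)

if [ "$new" != "$old" ]; then
    # hypothetical urgent repo path; a full-tree sync would also work
    rsync -aqH rsync://msync.centos.org/CentOS/7/urgent/x86_64/ \
        /srv/mirror/centos/7/urgent/x86_64/ \
        && printf '%s\n' "$new" > "$STATE"
fi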
Just my 2¢.
On 02/04/2015 04:04 PM, centoslistmail@gmail.com wrote:
> On Feb 03 8:58pm, Karanbir Singh wrote:
>> Repeated polling is counterproductive [...] Having dedicated nodes
>> just to push rsync targets is also bad, since those machines then
>> don't deliver any user-facing service (or bandwidth) for most of
>> the time.
> Since the collection of mirror hosts is really just a large
> distributed system, it would be prudent to think about it in that
> context and not worry (at this point) about such minor
> implementation-specific details.
This is not a minor issue... being able to saturate links from our side, with a focus on what-matters-when, allowed us to reduce the overall mirror seed time from 7 days to just under 2.5 days for a major release, and that is in spite of the fact that we seed almost 4,000 external mirrors at point of release.

But again, this isn't the question at hand!
> The overview (10,000 ft view) becomes simply the message layer and
> the transport layer. Rsync is perfectly sufficient for the transport
> layer. [...] What is needed is a better (not time-based) method of
> triggering rsync. A simple timestamp check of a file grabbed via
> curl, while not exactly robust, would suffice as a trigger. [...]
If we are able to solve this, "what changed since I saw you last", without needing to walk and compare metadata on every file across a 100GB corpus, we would have quite a nice solution indeed. But how does one implement that?

A reverse, opportunity-driven cache replacing the mirror nodes? So we'd have a CDN of sorts, with an on-demand repo-level expunge?
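One possible answer, sketched with a hypothetical manifest: the master appends every pushed file to a serial-numbered changes list, so mirrors only fetch entries newer than their stored serial and hand exactly those paths to rsync, never walking the tree.

#!/bin/bash
# manifest lines (hypothetical format): "<serial> <relative/path>"
MANIFEST_URL="http://msync.centos.org/centos/changes.log"
STATE=/var/lib/centos-mirror/serial
last=$(cat "$STATE" 2>/dev/null || echo 0)

curl -fsS "$MANIFEST_URL" -o /tmp/changes.all || exit 0
awk -v n="$last" '($1 + 0) > (n + 0)' /tmp/changes.all > /tmp/changes.new
[ -s /tmp/changes.new ] || exit 0     # nothing new since last run

awk '{print $2}' /tmp/changes.new > /tmp/changes.files
# transfer only the listed files; deletions and repodata ordering
# would still need handling in a real implementation
rsync -aqH --files-from=/tmp/changes.files \
    rsync://msync.centos.org/CentOS/ /srv/mirror/centos/ \
    && tail -n 1 /tmp/changes.new | awk '{print $1}' > "$STATE"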
On Tue, Feb 3, 2015 at 7:38 AM, Karanbir Singh mail-lists@karan.org wrote:
> At the end of the Dojo in Brussels, I had the chance to put a
> question to our contributor audience: how can we get security updates
> out to user machines faster? [...] it can be up to 16 to 18 hours
> before we get a majority content sync in front of most users.
Why don't you combine two concepts here: delegate a separate set of 'security-only' update repositories to fast, high-capacity sites. Put only the critical updates there, along with any dependencies needed for yum to complete the update. Let someone with access to that data that you can't republish decide which updates are security-related.

Not only does this reduce the needed fan-out, but it provides a much better case for leaving auto-updates enabled on that repository, or at least scheduling an update at the first possible chance, since it would introduce fewer arbitrary and unnecessary changes.
Yeah, a separate repo like that, which some of the mirrors could sync every, say, 10 minutes... could work?
Lucian
--
Sent from the Delta quadrant using Borg technology!
Nux! www.nux.ro
----- Original Message -----
From: "Les Mikesell" lesmikesell@gmail.com
To: "The CentOS developers mailing list." centos-devel@centos.org
Sent: Tuesday, 3 February, 2015 16:26:36
Subject: Re: [CentOS-devel] setting up an emergency update route
> Why don't you combine two concepts here: delegate a separate set of
> 'security-only' update repositories to fast, high-capacity sites.
> [...] Let someone with access to that data that you can't republish
> decide which updates are security-related.