On 02/04/2015 04:04 PM, centoslistmail@gmail.com wrote:
On Feb 03 8:58pm, Karanbir Singh wrote:
repeated polling is counterproductive. for the 6 times the high-prio push was needed in the last year, it's a waste to destroy mirror caches every 10 min through the entire year.
having dedicated nodes just to push rsync targets is also bad - since those machines then don't deliver any user-facing service (or bandwidth) for most of the time.
Since the collection of mirror hosts is really just a large distributed system, it would be prudent to think about it in that context and not worry (at this point) about such minor implementation-specific details.
this is not a minor issue... being able to saturate links from our side, with a focus on what-matters-when, allowed us to reduce the overall mirror seed time for a major release from 7 days to just under 2.5 days - and that is in spite of the fact that we seed almost 4,000 external mirrors at point of release.
but again, this isn't the question at hand!
The overview (the 10,000 ft view) reduces to just two layers: a message layer and a transport layer. Rsync is perfectly sufficient for the transport layer. The problem being discussed, however, mostly concerns the message layer, which simply answers "when is there new stuff to grab?". The problem is muddled by the fact that rsync is being used as part of the message layer too, and that is not optimal. Rsync should be able to say:
"I am grabbing that which is different"
Instead of saying:
"If there is something different, I will grab it"
The second sentence is primarily a question of when, not a question of what. Rsync is a very expensive way of asking when. What is needed is a better (not time-based) method of triggering rsync. A simple timestamp check on a file grabbed via curl, while not exactly robust, would suffice as a trigger. Polling such a tiny thing at a high rate would be low cost, and logic based on that poll would determine whether rsync is triggered. Other options, like a RabbitMQ-based queue, could be more robust in that they could also coordinate the external rsync processes, managing a thundering herd and lessening the chance of an inadvertent DDoS.
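As a rough illustration only (the state-file URL, rsync module, and paths below are made-up placeholders, not anything that exists today), the mirror-side loop could look something like this:

#!/usr/bin/env python3
# Sketch of "cheap poll, expensive rsync only on change".
# STATE_URL, RSYNC_SRC and LOCAL_DIR are hypothetical placeholders.

import subprocess
import time
import urllib.request

STATE_URL = "http://msync.example.org/TIME"      # tiny file the master rewrites on each push
RSYNC_SRC = "rsync://msync.example.org/CentOS/"  # hypothetical rsync module
LOCAL_DIR = "/srv/mirror/centos/"
POLL_SECS = 600                                  # a 10-minute poll costs a few bytes

def fetch_state():
    # A few bytes per poll instead of an rsync walk over the whole tree.
    with urllib.request.urlopen(STATE_URL, timeout=30) as resp:
        return resp.read().strip()

def main():
    last_seen = None
    while True:
        try:
            state = fetch_state()
        except OSError:
            state = None  # fetch failure: treat as "no change" and retry next cycle
        if state is not None and state != last_seen:
            # Only now pay for a full rsync pass.
            subprocess.run(["rsync", "-aH", "--delete", RSYNC_SRC, LOCAL_DIR])
            last_seen = state
        time.sleep(POLL_SECS)

if __name__ == "__main__":
    main()

A real script would also want locking and a random sleep before the rsync, so that 4,000 mirrors seeing the same state change don't all hit the master in the same second - which is exactly the coordination a queue like RabbitMQ would provide.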
if we are able to solve this - "what changed since i saw you last?" - without needing to walk and compare metadata on every file across a 100GB corpus, we would have quite a nice solution indeed. But how does one implement that?
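to be clear about the shape of what i mean - purely a sketch, with a made-up manifest name and format - the master could publish one small per-repo checksum file, and a mirror would only diff that against its last-seen copy to decide which repos need an rsync pass:

#!/usr/bin/env python3
# sketch only: compare a tiny published per-repo manifest against the local
# copy, instead of walking metadata for every file in a ~100GB corpus.
# MANIFEST_URL and the JSON format are invented for illustration.

import json
import urllib.request

MANIFEST_URL = "http://msync.example.org/repo-state.json"  # e.g. {"7/os/x86_64": "sha256:...", ...}
LOCAL_STATE = "/var/lib/mirror/repo-state.json"

def load_remote():
    with urllib.request.urlopen(MANIFEST_URL, timeout=30) as resp:
        return json.load(resp)

def load_local():
    try:
        with open(LOCAL_STATE) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}

def changed_repos():
    remote, local = load_remote(), load_local()
    # only repos whose published checksum moved need an rsync pass
    return [repo for repo, csum in remote.items() if local.get(repo) != csum]

if __name__ == "__main__":
    for repo in changed_repos():
        print(repo)  # hand these paths to per-repo rsync jobs, then save the new manifest

the point being: the "what changed" question gets answered from a few kilobytes of metadata, and per-file comparison only ever happens inside the repos that actually moved.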
a reverse, opportunity-driven cache replacing mirror nodes? so we have a CDN of sorts, with an on-demand, repo-level expunge?