[CentOS-devel] setting up an emergency update route

centoslistmail at gmail.com

Wed Feb 4 16:04:28 UTC 2015


On Feb 03  8:58pm, Karanbir Singh wrote:
>
>repeated polling is counterproductive. for the 6 times the high-prio
>push was needed in the last year, it's a waste to destroy mirror caches
>every 10 min through the entire year.
>
>having dedicated nodes to just push rsync targets is also bad - since
>those machines then don't deliver any user facing service ( or bandwidth
>) for most of the time.

Since the collection of mirror hosts is really just a large distributed
system, it would be prudent to think about it in that context and not
worry (at this point) about such minor implementation-specific details.

The overview (10,000 ft view) becomes simply the message layer and the 
transport layer. Rsync is perfectly sufficient for the transport layer.  
The problem being discussed, however, is mostly relevant to the message 
layer. That layer is simply "when is there new stuff to grab?". The 
problem is muddled by the fact that rsync is being used as a part of the 
message layer, too, and that is not optimal. Rsync should be able to 
say:

"I am grabbing that which is different"

Instead of saying:

"If there is something different, I will grab it"

The second sentence is primarily a question of when, not a question of
what. Rsync is a very expensive way of trying to ask when. What is
needed is a better (not time-based) method of triggering rsync. A simple
timestamp check of a file grabbed via curl, while not exactly robust,
would suffice as a trigger. Polling such a tiny file at a high rate
would be cheap, and logic based on the result of that poll would then
determine whether rsync is triggered. Other options, like a
RabbitMQ-based queue, could be very robust in that they can coordinate
the external rsync processes to manage a thundering herd and lessen the
chance of an inadvertent DDoS.
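
To make the timestamp-check idea concrete, here is a rough sketch in
Python. It swaps curl for the standard library's urllib and reads the
Last-Modified header of a small marker file, then starts rsync only
when that timestamp has moved forward. The marker URL, the rsync source
and destination, the poll interval, and every function name below are
made up for illustration; none of them come from the actual CentOS
mirror setup.

```python
import subprocess
import time
import urllib.request
from email.utils import parsedate_to_datetime

# Hypothetical values, chosen only for this sketch.
MARKER_URL = "http://mirror.example.org/centos/TIME"
RSYNC_CMD = ["rsync", "-a", "--delete",
             "rsync://mirror.example.org/centos/", "/srv/mirror/centos/"]
POLL_INTERVAL = 30  # seconds between cheap polls (assumed value)


def fetch_marker_time(url=MARKER_URL, timeout=10):
    """Return the marker file's Last-Modified time as a POSIX timestamp."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        header = response.headers["Last-Modified"]
    if header is None:
        raise ValueError("server sent no Last-Modified header")
    return parsedate_to_datetime(header).timestamp()


def run_rsync():
    """Transport layer: let rsync grab whatever is different."""
    subprocess.run(RSYNC_CMD, check=True)


def poll_once(last_seen, fetch=fetch_marker_time, sync=run_rsync):
    """Message layer: one cheap poll that decides whether rsync runs.

    Returns the timestamp to remember for the next poll.
    """
    current = fetch()
    if last_seen is not None and current <= last_seen:
        return last_seen  # nothing new, so rsync is never started
    sync()
    return current


def watch():
    """Poll forever; a failed poll or sync is reported and retried later."""
    last_seen = None
    while True:
        try:
            last_seen = poll_once(last_seen)
        except (OSError, ValueError, subprocess.CalledProcessError) as exc:
            print(f"poll failed: {exc}")
        time.sleep(POLL_INTERVAL)
```

Calling watch() on a mirror would run the loop. The first poll always
syncs because there is no remembered timestamp yet. This does nothing
about the thundering herd; that is where something like a queue would
come in.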

Just my 2¢.

-- 
jc


