[CentOS-mirror] Chinese IPs - Mirror Stats

Sat Jan 23 00:12:21 UTC 2010
Peter Poeml <poeml at cmdline.net>

On Fri, Jan 22, 2010 at 12:23:07 +0000, Karanbir Singh wrote:
> On 01/22/2010 12:11 PM, Prof. P. Sriram wrote:
> > I believe a correction might be in order - we have made it non-usable for
> > those that have 1 ip address and want to download at a rate exceeding 5
> > active connections per minute. Do you know of any such organizations?
> 
> yes, lots! including almost every office environment in the SME setup.
> Many people run development and testing VMs / machines inside their
> offices - and almost all have a small set of ADSL links coming in (in
> the EU and US at least), that they use for all outbound internet
> connectivity behind a NAT setup. In many cases, yum-cron like jobs will
> kick off at very similar times across an organisation.

I agree that 5 connections per IP is a little slim. However, given that
you apply such limits to the files where it matters (large files), it's
certainly reasonable to expect such companies or regional setups to set
up a mirror for themselves, in their own best interest.

In my experience, though, it wasn't necessary to go lower than 20 with
this kind of restriction, and I'm quite sure that this leaves enough
headroom for the type of setups that you mention.

If it really poses a problem, it would be trivial to enhance the Apache
module to look not only at IP + number of connections, but also to key
the limit on URL + User-Agent.
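
As a rough sketch of what keying on more than the IP could look like
(this is not the Apache module's code; the class, the field names "ip",
"url" and "user_agent", and the request-as-dict shape are all
hypothetical, chosen only for illustration):

```python
from collections import Counter


class ConnectionLimiter:
    """Track active connections per key and refuse new ones above a limit.

    Hypothetical illustration only; not how any real Apache module works.
    """

    def __init__(self, limit=20, key_fields=("ip",)):
        self.limit = limit
        self.key_fields = key_fields
        self.active = Counter()

    def _key(self, request):
        # Build the lookup key from the configured request fields.
        return tuple(request[field] for field in self.key_fields)

    def acquire(self, request):
        """Return True and count the connection if it is under the limit."""
        key = self._key(request)
        if self.active[key] >= self.limit:
            return False
        self.active[key] += 1
        return True

    def release(self, request):
        """Call when a connection ends so the slot becomes free again."""
        key = self._key(request)
        if self.active[key] > 0:
            self.active[key] -= 1
            if self.active[key] == 0:
                del self.active[key]


# Two different clients behind the same NAT address.
yum_client = {"ip": "192.0.2.1", "url": "/a.iso", "user_agent": "yum"}
wget_client = {"ip": "192.0.2.1", "url": "/b.iso", "user_agent": "wget"}

keyed = ConnectionLimiter(limit=2, key_fields=("ip", "url", "user_agent"))
keyed.acquire(yum_client)
keyed.acquire(yum_client)
print(keyed.acquire(yum_client))   # same client, same file: refused
print(keyed.acquire(wget_client))  # other client behind the NAT: allowed
```

With the key reduced to ("ip",) the two clients would share one budget,
which is exactly the NAT case discussed above.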

Yes, I'm postulating that these excessive parallel connections are _not_
the result of some evil mind, but purely the result of misinformation on
the side of users, who naively tweak the setting that is "supposed" to
make downloads faster (and it does work to some extent). In fact, there
is probably a good amount of desperation involved that drives people to
try this kind of extreme setting. Nobody in the better-connected world
would ever see the need to do so. But Chinese users need to go through
the eye of a needle...

(I have virtually never seen, or even heard of, deliberate DoS attacks
against open source mirrors; most issues seem to be caused by
misconfiguration or broken software; anyone else?)

> > Shouldn't they be enhancing their connectivity?
> 
> an example - ADSL2+ brings in approx 16Mbps downstream, that's plenty of
> connectivity for most offices with <= 50 employees who mostly only do
> :80/:443 sort of traffic, with some other things like :22 and maybe
> rsync. They should perhaps consider setting up local repos within their
> facility, but many lack the resources to do so.
> 
> > If you know of any package that provides this enhanced functionality, I
> > would be happy to implement that instead of our current scheme.
> 
> I personally don't. But if it's a case of watching a log file, it should
> not be hard to implement. However, the problem of things like repomd.xml
> etc. still persists.

I doubt that the negative effects of massive parallel connections occur
with repomd.xml files -- so there wouldn't be a need to impose the
limitation on them in the first place, or what do you think?
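
To illustrate the idea of limiting only where it matters, here is a
minimal sketch of a check that exempts small metadata files. The 10 MiB
threshold, the exempt-name set and the function name are assumptions
made up for this example, not values anyone in this thread has proposed:

```python
import posixpath

# Assumption: metadata files that should never be connection-limited.
EXEMPT_NAMES = {"repomd.xml"}

# Assumption: only files at least this large are worth limiting.
MIN_LIMITED_SIZE = 10 * 1024 * 1024


def limit_applies(path, size_bytes, min_size=MIN_LIMITED_SIZE):
    """Decide whether the per-client connection limit covers this file."""
    if posixpath.basename(path) in EXEMPT_NAMES:
        return False
    return size_bytes >= min_size


print(limit_applies("/centos/5/os/i386/repodata/repomd.xml", 1200))
print(limit_applies("/centos/5/isos/i386/some-dvd.iso", 4 * 1024 ** 3))
```

The first call returns False and the second True, so a burst of
yum-cron jobs fetching repomd.xml would not count against the limit.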

> How about turning off 'Range' requests in httpd? Is that an option?

I recommend against it. Even though servers are not required to support
Range requests in HTTP/1.1, they are so universally supported (in
default configurations) that there is a certain expectation on the
client side that a server _will_ support them. If a server doesn't, it
can lead to ugly surprises on the client side (pulling gigabytes of data
into memory instead of a small chunk), and the server might also end up
delivering more data than it would otherwise.
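
To make the "ugly surprise" concrete: a server that ignores a Range
header answers with 200 and the full file instead of 206 Partial
Content. A careful client would check for that before reading the body.
The following is only a sketch of such a check; the function names and
the plain-dict header representation are assumptions for illustration:

```python
def range_header(start, end):
    """Build a Range header asking for bytes start..end (inclusive)."""
    return {"Range": f"bytes={start}-{end}"}


def safe_to_read_body(status, headers, requested_len):
    """Return True if reading the whole response body is safe.

    206 means the server honoured the Range request. 200 means it
    ignored the Range header and is sending the complete file, which
    is only acceptable if the file is no larger than what we asked for.
    """
    if status == 206:
        return True
    if status == 200:
        # Header names are case-insensitive, so normalise them first.
        lowered = {name.lower(): value for name, value in headers.items()}
        length = lowered.get("content-length")
        return length is not None and int(length) <= requested_len
    return False


print(range_header(0, 1023))
print(safe_to_read_body(206, {"Content-Range": "bytes 0-1023/4000000000"}, 1024))
print(safe_to_read_body(200, {"Content-Length": "4000000000"}, 1024))
```

The last call returns False: the client asked for 1 KiB and would
otherwise have been handed the full multi-gigabyte file.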

Some people have claimed that range requests have a negative effect on
buffer caches (and proposed switching them off for that reason), but
from what I see it doesn't seem to pose a real-world problem for
mirrors.

Peter