All,
Due to an inordinate amount of Chinese based traffic from only a handful of IP addresses (over 145 TB transferred to just 12 IP addresses in only 15 days since we started hosting the mirror) we are forced to cease our participation in the CentOS mirror project.
In our research before deciding to offer our support we were told to expect a sustained 3-5 Mbit/s of mirror traffic. In reality, and from only a handful of IPs, we regularly push over 200Mbit/s on our 300Mbit/s line. Each of the abusive IPs downloads the same DVD iso files over and over thousands of times. We have tried blocking the abusive IPs only to see another IP with a sequentially increased last octet take its place. Whether this is an outright attack or just an unfortunate coincidence matters not.
Regretfully, I must ask that we be delisted from the mirror list asap. Once our links are down, we will shut down the server.
At some point in the future we may decide to participate again but for now, we cannot justify the inordinate bandwidth use.
Scott Adametz
Systems Engineer
Big Ten Network
600 W Chicago Ave.
Chicago, IL 60654
scott.adametz@bigtennetwork.com
O:312.665.0787 C:708.214.4232
Dear Scott & members of list.
I can confirm that we are suffering from this problem as well: excessive traffic from Chinese IPs, but also various attempts to attack the servers (DoS, hijacking, ...). It is a scourge that we cannot bear for long, and it unfortunately forces us to reconsider our continued participation in the CentOS mirror project. Our workload has increased by 50% because of everything needed to reinforce security, settings and so on, and yet we have still had two incidents involving hacked email accounts or password identity theft.
At the moment we continue, but I cannot say for how long.
Regards
Jose A. Crespo System Administrator
--------------------------------------------------------------------------------
info@mail.idl3.net · http://www.idl3.net
--------------------------------------------------------------------------------
----- Original Message ----- From: Scott Adametz To: centos-mirror@centos.org Sent: Thursday, January 21, 2010 9:06 PM Subject: [CentOS-mirror] Please remove from mirror list: mirrors.bigtennetwork.com/CentOS
On 21.01.10 21:32, Administrador wrote:
Dear Scott & members of list.
I can confirm that we are suffering from this problem as well: excessive traffic from Chinese IPs, but also various attempts to attack the servers (DoS, hijacking, ...). It is a scourge that we cannot bear for long, and it unfortunately forces us to reconsider our continued participation in the CentOS mirror project. Our workload has increased by 50% because of everything needed to reinforce security, settings and so on, and yet we have still had two incidents involving hacked email accounts or password identity theft.
Okay, apart from the mail account and password identity theft (which really cannot be a problem with you mirroring CentOS): Do you want to stay on the mirrorlist, but are reconsidering or do you want to be taken off the list because of you getting too much traffic?
Regards,
Ralph
Dear Ralph.
At the moment we remain a mirror. If the decision to withdraw is taken (only after exhausting all possible options), it will be communicated officially. So for now, we continue as a mirror.
Regards
José A. Crespo
----- Original Message ----- From: "Ralph Angenendt" ralph.angenendt@gmail.com To: centos-mirror@centos.org Sent: Thursday, January 21, 2010 10:43 PM Subject: Re: [CentOS-mirror] Please remove from mirror list: mirrors.bigtennetwork.com/CentOS
Okay, apart from the mail account and password identity theft (which really cannot be a problem with you mirroring CentOS): Do you want to stay on the mirrorlist, but are reconsidering or do you want to be taken off the list because of you getting too much traffic?
We've seen the same Chinese IPs downloading the DVD ISOs 24/7 with something like 10-20 concurrent connections, sometimes using upwards of 300 Mbit/s. We dropped support for the DVD downloads and implemented systems to ban such malicious activity. There's no way this is a 'download accelerator'. Someone or some ISP is trying to generate traffic (for whatever reason, maybe to meet peering quotas), and is using our DVD mirrors as an easy method.
-- Randy www.FastServ.com
---------- Original Message ----------- From: "Administrador" admin@mail.idl3.net To: "Mailing list for CentOS mirrors." centos-mirror@centos.org Sent: Fri, 22 Jan 2010 04:04:09 +0100 Subject: Re: [CentOS-mirror] Please remove from mirror list: mirrors.bigtennetwork.com/CentOS
CentOS-mirror mailing list CentOS-mirror@centos.org http://lists.centos.org/mailman/listinfo/centos-mirror
------- End of Original Message -------
If I were you, I'd drop the whole class C, e.g. 192.168.0.0/24. That should drop the traffic with no problem; they can't have too many addresses, even if you have to drop a handful of class Cs.
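A minimal sketch of the "drop the whole /24" idea, assuming Python's standard `ipaddress` module. The sample addresses are hypothetical (documentation ranges), not the real offenders, and `enclosing_slash24` is a name invented for this illustration. It collapses individual addresses into their enclosing /24 networks, so a replacement address with the next last octet is already covered by the block.

```python
import ipaddress

def enclosing_slash24(addresses):
    """Return the sorted, de-duplicated /24 networks covering the given IPv4 addresses."""
    networks = {
        # strict=False masks off the host bits instead of raising an error
        ipaddress.ip_network(f"{addr}/24", strict=False)
        for addr in addresses
    }
    return sorted(networks)

# Hypothetical offenders with sequentially increasing last octets.
offenders = ["203.0.113.7", "203.0.113.8", "203.0.113.9", "198.51.100.20"]
blocks = enclosing_slash24(offenders)
for net in blocks:
    print(net)
```

Note that a /24 block also shuts out any legitimate user in the same range, which is the trade-off discussed later in the thread.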
On 21/01/10 20:06, Scott Adametz wrote:
Due to an inordinate amount of Chinese based traffic from only a handful of IP addresses (over 145 TB transferred to just *12* IP addresses in only 15 days since we started hosting the mirror) we are forced to cease our participation in the CentOS mirror project.
Do you have some details on what this traffic is? As in, the number of times they hit your mirror, and which URLs they were fetching from there?
On Thu, Jan 21, 2010 at 09:21:53PM +0000, Karanbir Singh wrote:
Do you have some details on what this traffic is? As in, the number of times they hit your mirror, and which URLs they were fetching from there?
I have the same "problem" on my mirror. But it is not CentOS related. I am getting multiple http connections (between 30 and 50) from one IP from China on many files I have on my mirror. From what I heard it is some kind of download accelerator.
Adrian
On 01/21/2010 11:15 PM, Adrian Reber wrote: ...
I have the same "problem" on my mirror. But it is not CentOS related. I am getting multiple http connections (between 30 and 50) from one IP from China on many files I have on my mirror. From what I heard it is some kind of download accelerator.
All these "downloads" stopped when I set up a redirect of the .iso downloads to use ftp instead.
Real downloaders don't have a problem with this as the browser or wget just continues downloading via ftp.
Unfortunately, this is against CentOS rules, so we are no longer on the mirror list for http access (although it still works - for non .iso files).
As I understand it, the reason for not accepting this redirect of .iso files is that some users can't do FTP downloads because of proxy settings.
If this restriction is lifted, please let me know.
Mogens
You can also limit multiple connections by using mod_limitipconn (Apache httpd, available at epel):
<IfModule mod_limitipconn.c>
    MaxConnPerIP 6
</IfModule>
kind regards,
timm
On 22.01.2010 08:56, Mogens Kjaer wrote:
On 01/22/2010 08:08 AM, Timm Stamer wrote:
You can also limit multiple connections by using mod_limitipconn (Apache httpd, available at epel):
<IfModule mod_limitipconn.c>
    MaxConnPerIP 6
</IfModule>
I don't think that's a good idea at all; you are effectively cutting out all the people who must run behind a proxy or a NAT setup (most offices and small to medium businesses are like that).
As I said already, if there was something smarter that could also consider the file path in some way, that might be a better option. But then again, that would not cover the situation where repomd.xml and the repodata/* files are requested by multiple clients.
I do indeed have stats @ http://stats.btnchicago.com/usage_201001.html
Most of the traffic came from Chinese addresses in the 114.249.219.0, 121.41.181.0, 221.0.0.0, 123.118.107.0 and 218.1.7.200.0 subnets. According to GeoIP these originate from Beijing, Zhejiang, Hubei, Beijing and Fujian respectively. Each downloaded approximately 23TB, 17TB, 10TB, 10TB, 10TB and exhibited similar repetitive patterns of the same file.
The stats analysis is via webalizer (however, I also ran analog on the server logs and it confirms the amount of traffic is correct). On the 15th I began blocking several of the offending /8 subnets (drastic, I know) while we determined what to do next, hence the deep drop-off in hits. On the 16th I removed the blocks and tried qdisc throttling and bandwidth management tools via Apache which, as you can see, only moderately control the rate, not the requests. It isn’t so much the traffic that we're worried about now as the attention being drawn to our network. The upper management at FOX is not interested in being a target, hence the request to terminate the mirror (which has been shut down and will remain down).
We are extremely grateful for all the time and effort put forth by all the CentOS volunteers and offer our sincerest apologies for backing out of the mirror hosting. It was a decision not made lightly.
Thank you all for the insight and assistance. We hope to find a way to give back to the CentOS project in some other way in the near future.
Cheers,
Scott Adametz Systems Engineer Big Ten Network 600 W Chicago Ave. Chicago, IL 60654 scott.adametz@bigtennetwork.com O:312.665.0787 C:708.214.4232
On Thu, Jan 21, 2010 at 02:16:44PM -0800, Scott Adametz wrote:
I do indeed have stats @ http://stats.btnchicago.com/usage_201001.html
Most of the traffic came from Chinese addresses in the 114.249.219.0, 121.41.181.0, 221.0.0.0, 123.118.107.0 and 218.1.7.200.0 subnets. According to GeoIP these originate from Beijing, Zhejiang, Hubei, Beijing and Fujian respectively. Each downloaded approximately 23TB, 17TB, 10TB, 10TB, 10TB and exhibited similar repetitive patterns of the same file.
If you have not configured mod_logio to log the actual data transmitted, those numbers are probably wrong. The numbers are the actual file size and not the number of bytes transmitted with a range request. At least that's how it was on my mirror.
I am logging both values: the one is 71TB, while the traffic actually transmitted is only around 3TB per day.
Adrian
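To illustrate the gap Adrian describes, here is a rough sketch in Python of how many bytes a single-range request actually moves compared with counting the full file size per hit. The helper `bytes_for_range`, the 4 GiB image size and the 1 MiB chunk size are assumptions made up for this example; it handles only the simple `bytes=start-end` forms, not multi-range requests.

```python
import re

def bytes_for_range(range_header, file_size):
    """Bytes sent for a single-range 'bytes=start-end' request (simplified model)."""
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", range_header.strip())
    if not match or (not match.group(1) and not match.group(2)):
        return file_size  # unparseable header: assume the whole file is sent
    start_s, end_s = match.groups()
    if not start_s:
        return min(int(end_s), file_size)  # suffix form 'bytes=-N': last N bytes
    start = int(start_s)
    if start >= file_size:
        return 0  # unsatisfiable; a real server would answer 416
    end = min(int(end_s), file_size - 1) if end_s else file_size - 1
    if end < start:
        return file_size  # invalid range: the header is ignored
    return end - start + 1

MIB = 1024 ** 2
iso_size = 4 * 1024 ** 3  # hypothetical 4 GiB DVD image
requests = [f"bytes={i * MIB}-{(i + 1) * MIB - 1}" for i in range(1000)]

logged_as_full_file = len(requests) * iso_size  # what a size-per-hit count would report
actually_sent = sum(bytes_for_range(r, iso_size) for r in requests)
print(logged_as_full_file // actually_sent)  # overcount factor
```

Under these assumptions, a thousand 1 MiB chunk requests are counted as a thousand full downloads, so the reported volume is inflated by the ratio of file size to chunk size.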
-- Karanbir Singh kbsingh@karan.org | http://www.karan.org/ | twitter.com/kbsingh ICQ: 2522219 | Yahoo IM: z00dax | Gtalk: z00dax GnuPG Key : http://www.karan.org/publickey.asc
On 01/21/2010 10:24 PM, Adrian Reber wrote:
If you have not configured mod_logio to log the actual data transmitted, those numbers are probably wrong.
Adrian, that looks almost certainly like the correct situation. I don't think we have seen the sort of mass mirror.centos.org traffic uptake that would happen if 300k new people were to do at least 1 install each (going only by the number of people requesting the DVD iso from Scott's mirror).
A more realistic situation would be that 100 people have done the download using an accelerator, resulting in many Range requests and partial GETs to the httpd process.
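A quick arithmetic check supports this reading. The sketch below, in Python, uses only the figures from the original post (145 TB, 15 days, a 300 Mbit/s line) and assumes decimal terabytes. It works out the average rate the reported volume would require and the most a fully saturated line could have carried.

```python
# Sanity check: is 145 TB in 15 days possible on a 300 Mbit/s line?
TB = 10 ** 12  # assumption: decimal terabytes

reported_bytes = 145 * TB       # total reported by the log analysis
period_s = 15 * 24 * 60 * 60    # 15 days in seconds
line_mbit = 300                 # line capacity from the original post

# Average rate needed to move the reported volume in that period.
implied_mbit = reported_bytes * 8 / period_s / 1e6

# Upper bound on what a fully saturated line could have carried.
max_tb = line_mbit * 1e6 / 8 * period_s / TB

print(f"implied average rate: {implied_mbit:.0f} Mbit/s")
print(f"ceiling for a {line_mbit} Mbit/s line: {max_tb:.1f} TB")
```

The reported volume would need roughly 895 Mbit/s sustained, about three times the line capacity, while the line itself tops out near 48.6 TB over 15 days. The 145 TB figure therefore cannot be bytes actually transmitted.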
On Thu, 21 Jan 2010, Scott Adametz wrote:
Most of the traffic came from Chinese addresses in the 114.249.219.0, 121.41.181.0, 221.0.0.0, 123.118.107.0 and 218.1.7.200.0 subnets. According to GeoIP these originate from Beijing, Zhejiang, Hubei, Beijing and Fujian respectively. Each downloaded approximately 23TB, 17TB, 10TB, 10TB, 10TB and exhibited similar repetitive patterns of the same file.
We had a similar issue at the centos (and other stuff) mirror at ftp.iitm.ac.in some months ago. We have solved it effectively using a per-IP connection limit and fail2ban. It appears that the traffic originates from a download accelerator that is popular in China. We used to get similar thousands of ranged requests for the iso image files of centos and other linux distributions.
We have put a per-IP connection limit of 5 in place using the limitipconn module. Connection attempts over 5 get logged in the apache error log. The fail2ban package is used to monitor this log file; when any single IP generates more than 5 error messages in a minute (meaning that IP has tried to open more than 5 connections more than 5 times in a minute), fail2ban inserts an iptables firewall rule that blocks ALL connection requests from this IP for the next hour. After a few minutes, the 5 existing (ranged download request) connections complete their download and the offending IP is locked out for the rest of the hour.
Works very very effectively. We saw our hit rate drop from about 700,000 per day to below 100,000 per day. We continue to serve the centos (and other) mirror community. Scott, I would urge you to seriously consider this type of solution instead of dropping out of the mirror network. I will be happy to provide any further assistance in this regard.
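The ban rule described above can be modelled in a few lines of Python. This is only a sketch of the logic (more than 5 rejections from one IP within 60 seconds leads to a one-hour ban), not fail2ban's actual implementation; the class name `RejectionBanner` and its parameters are invented for this illustration, and timestamps are plain seconds.

```python
from collections import defaultdict, deque

class RejectionBanner:
    """Ban an IP once it exceeds max_rejections logged rejections within window_s seconds."""

    def __init__(self, max_rejections=5, window_s=60, ban_s=3600):
        self.max_rejections = max_rejections
        self.window_s = window_s
        self.ban_s = ban_s
        self._events = defaultdict(deque)  # ip -> timestamps of recent rejections
        self._banned_until = {}            # ip -> time at which the ban expires

    def is_banned(self, ip, now):
        return self._banned_until.get(ip, 0) > now

    def record_rejection(self, ip, now):
        """Record one 'connection limit exceeded' log entry; return True if ip is banned."""
        if self.is_banned(ip, now):
            return True
        events = self._events[ip]
        events.append(now)
        # Forget rejections that have fallen out of the sliding window.
        while events and events[0] <= now - self.window_s:
            events.popleft()
        if len(events) > self.max_rejections:
            self._banned_until[ip] = now + self.ban_s
            events.clear()
            return True
        return False

banner = RejectionBanner()
# Hypothetical accelerator: one rejected connection attempt per second.
results = [banner.record_rejection("203.0.113.7", t) for t in range(6)]
print(results)
```

In this model a client that is rejected only occasionally never accumulates enough entries inside the window, which is the behaviour the scheme relies on to spare ordinary users.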
On 01/22/2010 08:43 AM, Prof. P. Sriram wrote:
We had a similar issue at the centos (and other stuff) mirror at ftp.iitm.ac.in some months ago. We have solved it effectively using per ip connection limit and fail2ban.
The problem with this is that you have effectively made your mirror unusable for offices and organisations that only have 1 IP address to the world. There are quite a few of them.
This sort of process would work better if it checked, and only acted against, an IP when the same filename is being requested repeatedly, rather than counting overall connections.
On Fri, 22 Jan 2010, Karanbir Singh wrote:
The problem with this is that you have effectively made your mirror unusable for offices and organisations that only have 1 IP address to the world. There are quite a few of them.
I believe a correction might be in order - we have made it non-usable for those that have 1 ip address and want to download at a rate exceeding 5 active connections per minute. Do you know of any such organizations? Shouldn't they be enhancing their connectivity?
This sort of process would work better if it checked, and only acted against, an IP when the same filename is being requested repeatedly, rather than counting overall connections.
If you know of any package that provides this enhanced functionality, I would be happy to implement that instead of our current scheme.
On 01/22/2010 12:11 PM, Prof. P. Sriram wrote:
I believe a correction might be in order - we have made it non-usable for those that have 1 ip address and want to download at a rate exceeding 5 active connections per minute. Do you know of any such organizations?
Yes, lots! Including almost every office environment in the SME setup. Many people run development and testing VMs / machines inside their offices, and almost all have a small set of ADSL links coming in (in the EU and US at least) that they use for all outbound internet connectivity behind a NAT setup. In many cases, yum-cron-like jobs will kick off at very similar times across an organisation.
Shouldn't they be enhancing their connectivity?
An example: ADSL2+ brings in approx 16Mbps downstream; that's plenty of connectivity for most offices with <= 50 employees who mostly only do :80/:443 sort of traffic, with some other things like :22 and maybe rsync. They should perhaps consider setting up local repos within their facility, but many lack the resources to do so.
If you know of any package that provides this enhanced functionality, I would be happy to implement that instead of our current scheme.
I personally don't. But if it's a case of watching a log file, it should not be hard to implement. However, the problem of things like repomd.xml etc. still persists.
How about turning off 'RANGE' requests in httpd? Is that an option?
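The log-watching idea could look roughly like the Python sketch below. It assumes Apache's common/combined access log layout; the function name `repeated_iso_fetches`, the threshold, the sample path and the addresses are all hypothetical. Requests are counted per (IP, path) pair and only for .iso files, so many clients behind one NAT address fetching repomd.xml are never flagged.

```python
import re
from collections import Counter

# Matches the start of a common/combined log line: ip, ident, user, [time], "METHOD path ...", status
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

def repeated_iso_fetches(lines, threshold):
    """Return {(ip, path): hits} for .iso paths requested more than threshold times by one IP."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("path").lower().endswith(".iso"):
            counts[(m.group("ip"), m.group("path"))] += 1
    return {key: hits for key, hits in counts.items() if hits > threshold}

# Hypothetical log excerpt: one IP hammering a DVD image, one NAT site fetching metadata.
iso_path = "/centos/5.4/isos/x86_64/CentOS-5.4-x86_64-bin-DVD.iso"
iso_line = f'203.0.113.7 - - [22/Jan/2010:12:00:00 +0000] "GET {iso_path} HTTP/1.1" 206 1048576'
meta_line = '198.51.100.20 - - [22/Jan/2010:12:00:01 +0000] "GET /centos/5.4/os/x86_64/repodata/repomd.xml HTTP/1.1" 200 1140'
sample = [iso_line] * 25 + [meta_line] * 40

flagged = repeated_iso_fetches(sample, threshold=20)
print(flagged)
```

A real deployment would also need a time window and some way to act on the result; this sketch only shows the keying by IP and filename.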
On Fri, 22 Jan 2010, Karanbir Singh wrote:
An example: ADSL2+ brings in approx 16Mbps downstream; that's plenty of connectivity for most offices with <= 50 employees who mostly only do :80/:443 sort of traffic, with some other things like :22 and maybe rsync. They should perhaps consider setting up local repos within their facility, but many lack the resources to do so.
I think this is getting well off-topic, but what the heck.
Is it 'reasonable' for such an organization to be generating more than 5 active connections to a single upstream mirror? And that too after receiving a 503 service unavailable message? That is what it will take to get on the netblock list for an hour. You may disagree, but I think this is a reasonable restriction to keep the server available and protected from (ab)users.
How about turning off 'RANGE' requests in httpd? Is that an option?
Maybe it was a version thing, but the url rewriting did not work on the server in question.
On 01/22/2010 02:19 PM, Prof. P. Sriram wrote:
An example: ADSL2+ brings in approx 16Mbps downstream; that's plenty of connectivity for most offices with <= 50 employees who mostly only do
Is it 'reasonable' for such an organization to be generating more than 5 active connections to a single upstream mirror? And that too after
If there are dozens of computers behind that NAT IP, then yes, it's quite expected for them to generate more than a few connections per minute.
receiving a 503 service unavailable message? That is what it will take to get on the netblock list for an hour. You may disagree, but I think this is a reasonable restriction to keep the server available and protected from (ab)users.
On a 503, yum will fall back to the next mirror in the mirrorlist. However, that won't stop it from attempting a connection, and your machine will keep them on the blacklist.
How about turning off 'RANGE' requests in httpd? Is that an option?
Maybe it was a version thing, but the url rewriting did not work on the server in question.
Byte-range partial GETs are an HTTP 1.1 thing, aren't they? If you want to stick with 1.1, you can still disable them by unsetting the header, removing it from the request completely.
IIRC, kernel.org and heanet.ie both have partial GETs disabled. I wonder if they will share some info on how they are doing this, and what their recommended solution is to this sort of heavy hit rate from a small number of IPs.
- KB
On Fri, Jan 22, 2010 at 12:23:07 +0000, Karanbir Singh wrote:
On 01/22/2010 12:11 PM, Prof. P. Sriram wrote:
I believe a correction might be in order - we have made it non-usable for those that have 1 ip address and want to download at a rate exceeding 5 active connections per minute. Do you know of any such organizations?
Yes, lots! Including almost every office environment in the SME setup. Many people run development and testing VMs / machines inside their offices, and almost all have a small set of ADSL links coming in (in the EU and US at least) that they use for all outbound internet connectivity behind a NAT setup. In many cases, yum-cron-like jobs will kick off at very similar times across an organisation.
I agree that 5 connections per IP is a little slim. However, given that you apply such limits to the files where it matters (large files), it's certainly okay to expect such companies or regional setups to set up a mirror for themselves, in their own best interest.
In my findings, it wasn't necessary though to go lower than 20 with this kind of restriction, and I'm quite sure that this leaves enough headroom for the type of setups that you mention.
If it really poses a problem, it would be trivial to enhance the Apache module to look not only at IP + number of connections, but to key this to URL+User-Agent.
Yes, I'm postulating that these excessive parallel connections are _not_ the result of some evil mind, but purely the result of misinformation on the side of users (naively tweaking the button that is "supposed" to make it faster (and it works to some extent for them)). In fact, there'll be a good amount of desperation involved that causes people to try out this kind of extreme setting. Nobody in the better-connected world would ever see the need to do so. But Chinese users need to go through needle eyes...
(I have virtually never seen (and virtually never heard) of deliberate DoS attacks against open source mirrors; most issues seem to be misconfiguration or broken software; anyone else?)
Shouldn't they be enhancing their connectivity?
An example: ADSL2+ brings in approx 16Mbps downstream; that's plenty of connectivity for most offices with <= 50 employees who mostly only do :80/:443 sort of traffic, with some other things like :22 and maybe rsync. They should perhaps consider setting up local repos within their facility, but many lack the resources to do so.
If you know of any package that provides this enhanced functionality, I would be happy to implement that instead of our current scheme.
I personally don't. But if it's a case of watching a log file, it should not be hard to implement. However, the problem of things like repomd.xml etc. still persists.
I doubt that the negative effects of massive parallel connections occur with repomd.xml files -- so there wouldn't be a need to impose the limitation on them in the first place, or what do you think?
How about turning off 'RANGE' requests in httpd? Is that an option?
I recommend against it, because even though servers are not required to support Range requests in HTTP/1.1, they are so universally supported (in default configurations) that there is a certain expectation on the client side that servers _will_ support them. If a server doesn't, it can lead to ugly surprises on the client side (pulling gigabytes of data instead of a small chunk into memory), and the server might also end up delivering more data than it would otherwise.
Some people have claimed that range requests have a negative effect on buffer caches (and proposed switching them off for that reason), but from what I see it doesn't seem to pose a real-world problem for mirrors.
Peter
We only IPlimit .iso files and that solved the problem. Either way, even if you are sending 500's the yum clients at the NAT site will fail over to another mirror. What's the big deal??
-- Randy www.FastServ.com
---------- Original Message ----------- From: Peter Poeml poeml@cmdline.net To: centos-mirror@centos.org Sent: Sat, 23 Jan 2010 01:12:21 +0100 Subject: Re: [CentOS-mirror] Chinese IPs - Mirror Stats
On Fri, Jan 22, 2010 at 12:23:07 +0000, Karanbir Singh wrote:
On 01/22/2010 12:11 PM, Prof. P. Sriram wrote:
I believe a correction might be in order - we have made it non-usable for those that have 1 ip address and want to download at a rate exceeding 5 active connections per minute. Do you know of any such organizations?
yes, lots! including almost every office environment in the SME setup. Many people run development and testing VM's / machines inside their offices - and almost all have a small set of adsl links coming in ( in EU and US atleast ), that they use for all outbound internet connectivity behind a NAT setup. In many cases, yum-cron like jobs will kickoff at very similar times across an organisation.
I agree that 5 connections per IP is a little slim. However, given you apply such limits to the files where it matters (large files), it's certainly okay to expect from such companies or regional setups to set up a mirror for them, in their own best interest.
In my findings, it wasn't necessary though to go lower than 20 with this kind of restriction, and I'm quite sure that this leaves enough headroom for the type of setups that you mention.
If it really poses a problem, it would be trivial to enhance the Apache module to look not only at IP + number of connections, but to key this to URL+User-Agent.
Yes, I'm postulating that these excessive parallel connections are _not_ the result of some evil mind, but purely the result of misinformation on the side of users (naively tweaking the button that is "supposed" to make it faster (and it works to some extent for the)). In fact, there'll be some good amount of desperation involved that causes people to try out these kind of extreme settings. Nobody in the better-connected world would ever see the need to do so. But Chinese users need to go through needle eyes...
(I have virtually never seen (and virtually never heard) of deliberate DoS attacks against open source mirrors; most issues seem to be misconfiguration or broken software; anyone else?)
Shouldn't they be enhancing their connectivity?
an example - adsl2+ brings in approx 16Mbps downstram, thats plenty of connectivity for most offices with <= 50 employes who mostly only do :80/:443 sort of traffic, with some other things like :22 and maybe rsync. They should perhaps consider setting up local repo's within their facility, but many lack the resources to do so.
If you know of any package that provides this enhanced functionality, I would be happy to implement that instead of our current scheme.
I personally dont. But if its a case of watching a log file, should not be hard to implement. However, the problem of things like repomd.xml etc still persists.
I doubt that the negative effects of massive parallel connections occur with repomd.xml files -- so there wouldn't be a need to impose the limitation on them in the first place, or what do you think?
How about turning off 'RANGE' requests in httpd? Is that an option?
I recommend against it, because even though servers are not required to support Range requests in HTTP/1.1, they are so universally supported (in default configurations) that there is a certain expectation on the client side that they _will_ be supported. If a server doesn't, it can lead to ugly surprises on the client side (pulling gigabytes of data instead of a small chunk into memory), and also the server might end up delivering more data than it would otherwise.
Some people have made claims that range requests have a negative effect on buffer caches (and proposed to switch them off for that reason), but from what I see it doesn't seem to pose a real-world problem for mirrors.
Peter
------- End of Original Message -------
On 01/23/2010 02:11 AM, Randy McAnally wrote:
We only IPlimit .iso files and that solved the problem. Either way, even if you are sending 500's the yum clients at the NAT site will fail over to another mirror. What's the big deal??
Doing this only for .iso files is fine, but doing it for all files is not. The 'big deal' is that if all the mirrors were doing this form of rate limiting for all files on their servers, it's academic for yum to fail over to the next mirror, since that one will be blocking access as well.
Besides, let's not forget that yum can itself download, even from 1 machine, more than 5 packages in a minute. So doing a block across all files for 5/min is not a good idea.
- KB
Hello,
the initial synchronization was successful.
URL: http://78.46.104.194:8080/
Synchronization runs at: 8:00, 14:00 and 23:00 GMT+1
Server location: Nürnberg / Bavaria / Germany
Bandwidth: 100MBit/s Switchport, 99GBit/s Computing Center
Sponsor: http://www.foxyfighters.de Foxyfighters Clan
Kind regards
Patrick
On Sun, Jan 24, 2010 at 11:17:28AM +0100, Patrick Wulff wrote:
Hello,
the initial synchronization was successful.
URL: http://78.46.104.194:8080/
Synchronization runs at: 8:00, 14:00 and 23:00 GMT+1
Server location: Nürnberg / Bavaria / Germany
Bandwidth: 100MBit/s Switchport, 99GBit/s Computing Center
Sponsor: http://www.foxyfighters.de Foxyfighters Clan
just added,
Best regards and thank you for your support.
Tru
We don't 500 based on hit rate, rather concurrent connections. I considered taking it further with IP banning but it's really not needed. 500'ing of the Nth concurrent connection does the trick. Once they drop one of their connections, they can make new ones again.
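The behaviour described above (refuse the Nth concurrent connection, accept again once one is dropped) can be sketched roughly as follows. This is only a toy model in Python, not the Apache module's code; the class name, the cap of 5 and the example address are assumptions made for illustration.

```python
from collections import defaultdict


class ConcurrencyLimiter:
    """Toy model of a per-client cap on simultaneous connections."""

    def __init__(self, max_conn):
        self.max_conn = max_conn
        self.active = defaultdict(int)  # client IP -> open connections

    def open(self, ip):
        """Return True if a new connection from ip is accepted."""
        if self.active[ip] >= self.max_conn:
            return False  # over the cap: refuse this one, but do not ban
        self.active[ip] += 1
        return True

    def close(self, ip):
        """Release one slot when a connection from ip finishes."""
        if self.active[ip] > 0:
            self.active[ip] -= 1


# Hypothetical cap of 5 and a documentation-range address.
limiter = ConcurrencyLimiter(max_conn=5)
attempts = [limiter.open("192.0.2.10") for _ in range(6)]
limiter.close("192.0.2.10")          # the client drops one connection
after_close = limiter.open("192.0.2.10")
print(attempts, after_close)
```

No state survives beyond the open connections themselves, which is why a client can connect again as soon as it drops one.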
What I've noticed is that the number of concurrent yum connections is always very low, usually one or two at most. Yum clients have little to worry about; in fact, even with a global iplimit (not just ISO) no yum client was ever blocked. I limited it to iso files only 'just in case'.
-- Randy www.FastServ.com
---------- Original Message ----------- From: Karanbir Singh mail-lists@karan.org To: "Mailing list for CentOS mirrors." centos-mirror@centos.org Sent: Sat, 23 Jan 2010 14:29:27 +0000 Subject: Re: [CentOS-mirror] Chinese IPs - Mirror Stats
On 01/23/2010 02:11 AM, Randy McAnally wrote:
We only IPlimit .iso files and that solved the problem. Either way, even if you are sending 500's the yum clients at the NAT site will fail over to another mirror. What's the big deal??
Doing this only for .iso files is fine, but doing it for all files is not. The 'big deal' is that if all the mirrors were doing this form of rate limiting for all files on their servers, it's academic for yum to fail over to the next mirror, since that one will be blocking access as well.
Besides, let's not forget that yum can itself download, even from 1 machine, more than 5 packages in a minute. So doing a block across all files for 5/min is not a good idea.
- KB
CentOS-mirror mailing list CentOS-mirror@centos.org http://lists.centos.org/mailman/listinfo/centos-mirror
------- End of Original Message -------
--On Friday, January 22, 2010 17.41.17 +0530 "Prof. P. Sriram" sriram@ae.iitm.ac.in wrote:
On Fri, 22 Jan 2010, Karanbir Singh wrote:
On 01/22/2010 08:43 AM, Prof. P. Sriram wrote:
We had a similar issue at the centos (and other stuff) mirror at ftp.iitm.ac.in some months ago. We have solved it effectively using per ip connection limit and fail2ban.
The problem with this is that you have effectively made your mirror non-usable for offices and organisations that only have 1 IP address to the world. There are quite a few of them.
I believe a correction might be in order - we have made it non-usable for those that have 1 ip address and want to download at a rate exceeding 5 active connections per minute. Do you know of any such organizations? Shouldn't they be enhancing their connectivity?
I'm not getting into the "right/or/wrong" aspects of this, as both of you have valid points.
I'm curious though as to why you block them completely, instead of just putting them under some concurrency limit.
As I understand it you are injecting rules into netfilter to have the abusing addresses blocked, so I think it should be simple enough to put a limit on these addresses using the same injection mechanism. Or?
Regards, Emil
On Fri, 22 Jan 2010, Emil wrote:
I'm curious though as to why you block them completely, instead of just putting them under some concurrency limit.
The addresses are already under the concurrency limit as described in the original post. The netfilter block kicks in when there is a certain volume (requests per minute) EXCEEDING the concurrency limit. A human being exceeding the concurrency limit gets an HTTP 503 Service Unavailable message and will hopefully try again only after some time, when the concurrency limit is not being exceeded. Well, that is the plan, anyway.
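To make the two-stage idea concrete, here is a rough Python sketch of the second stage only: counting how many refusals (503s) a client collects within a time window and banning it once that count is exceeded. It is not fail2ban's implementation. The thresholds (more than 5 refusals in 60 seconds, a 600 second ban), the class name and the addresses are assumptions made for illustration.

```python
from collections import defaultdict, deque


class RefusalBanner:
    """Ban a client once it collects too many refusals within a window."""

    def __init__(self, max_refusals, window_s, ban_s):
        self.max_refusals = max_refusals
        self.window_s = window_s
        self.ban_s = ban_s
        self.refusals = defaultdict(deque)  # ip -> timestamps of recent 503s
        self.banned_until = {}              # ip -> time at which the ban ends

    def is_banned(self, ip, now):
        return self.banned_until.get(ip, 0.0) > now

    def record_refusal(self, ip, now):
        """Note one 503 sent to ip; return True if this triggers a ban."""
        q = self.refusals[ip]
        q.append(now)
        while q and q[0] <= now - self.window_s:
            q.popleft()  # forget refusals that fell out of the window
        if len(q) > self.max_refusals:
            self.banned_until[ip] = now + self.ban_s
            q.clear()
            return True
        return False


banner = RefusalBanner(max_refusals=5, window_s=60, ban_s=600)
# A patient client that retries every 30 seconds never trips the ban.
slow = [banner.record_refusal("192.0.2.1", t) for t in range(0, 300, 30)]
# A tool that retries every second is banned on its sixth refusal.
fast = [banner.record_refusal("192.0.2.2", t) for t in range(6)]
print(any(slow), fast, banner.is_banned("192.0.2.2", 5))
```

In this model a human who backs off after a 503 stays well under the threshold, while software that keeps hammering does not.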
--On Friday, January 22, 2010 18.55.11 +0530 "Prof. P. Sriram" sriram@ae.iitm.ac.in wrote:
On Fri, 22 Jan 2010, Emil wrote:
I'm curious though as to why you block them completely, instead of just putting them under some concurrency limit.
The addresses are already under the concurrency limit as described in the original post. The netfilter block kicks in when there is a certain volume (requests per minute) EXCEEDING the concurrency limit. A human being exceeding the concurrency limit gets an HTTP 503 Service Unavailable message and will hopefully try again only after some time, when the concurrency limit is not being exceeded. Well, that is the plan, anyway.
Still, the concurrency limit is within Apache, right? What I meant was to put an (additional) limit in netfilter instead of a "complete" block.
Should you only block new connections when the "ban" kicks in it won't be too bad, and the effect for the "visitor" should be very similar to a more gentle limit-based approach. If however you put a block based only on the IP address, existing connections will fail to complete, which obviously will give them a valid reason to start again as soon as the ban is lifted.
Anyway, thanks for the tip on fail2ban, I may put that to use in other places!
Regards, Emil
Hello,
How about giving iptables with hashlimit a try? I have already used it with success to fend off botnet DDoS attacks against web servers. Something like:
iptables -t filter -A INETIN -p tcp --syn -s 0/0 --dport 80 -m hashlimit --hashlimit 25/s --hashlimit-burst 20 --hashlimit-mode srcip --hashlimit-name HTTP -j ACCEPT
iptables -t filter -A INETIN -p tcp --syn -s 0/0 --dport 80 -m limit --limit 1/s --limit-burst 5 -j LOG --log-level $LOG_LEVEL --log-prefix "[HTTP_DROPPED_NEW] : "
iptables -t filter -A INETIN -p tcp --syn -s 0/0 --dport 80 -j DROP
iptables -t filter -A INETIN -p tcp -s 0/0 --dport 80 -m state --state NEW -j ACCEPT
should fix it... of course set the hashlimit to parameters which your mirror can take.
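For readers unfamiliar with this kind of limit, the effect of a per-source rate with a burst allowance can be pictured with a simplified token-bucket model. The Python sketch below is only an illustration of that idea and not how netfilter implements hashlimit; the function and class names are hypothetical, and the 25/s and 20 values are simply taken from the rules above.

```python
class TokenBucket:
    """Simplified token bucket: refills at `rate` per second, holds at most `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


buckets = {}  # one bucket per source IP, comparable to limiting by srcip


def accept_syn(src_ip, now, rate=25, burst=20):
    """Return True if a new connection attempt from src_ip is within the limit."""
    if src_ip not in buckets:
        buckets[src_ip] = TokenBucket(rate, burst)
    return buckets[src_ip].allow(now)


# 30 connection attempts from one address at the same instant.
flood = [accept_syn("198.51.100.7", 0.0) for _ in range(30)]
# A different address has its own bucket and is unaffected.
other = accept_syn("198.51.100.8", 0.0)
# One second later the first address has refilled, capped at the burst size.
later = [accept_syn("198.51.100.7", 1.0) for _ in range(25)]
print(sum(flood), other, sum(later))
```

The point of keying the bucket on the source address is that one aggressive client exhausts only its own allowance.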
Greetings
Juergen
On 01/21/2010 10:16 PM, Scott Adametz wrote:
I do indeed have stats @ http://stats.btnchicago.com/usage_201001.html
I wonder if you have any switch/port side stats as well ?
On 21.01.10 21:06, Scott Adametz wrote:
Due to an inordinate amount of Chinese based traffic from only a handful of IP addresses (over 145 TB transferred to just *12* IP addresses in only 15 days since we started hosting the mirror) we are forced to cease our participation in the CentOS mirror project.
I put your mirror down as "DISABLED", in case you want to reconsider :)
In our research before deciding to offer our support we were told to expect a sustained 3-5 Mbit/s of mirror traffic. In reality, and from only a handful of IPs, we regularly push over 200Mbit/s on our 300Mbit/s line. Each of the abusive IPs downloads the same DVD iso files over and over thousands of times. We have tried blocking the abusive IPs only to see another IP with a sequentially increased last octet take its place. Whether this is an outright attack or just an unfortunate coincidence matters not.
I'd like to keep the discussion up, though. Where does the traffic come from exactly? Can that be nailed down to a company or whatever?
Regretfully, I must ask that we be delisted from the mirror list asap. Once our links are down, we will shut down the server.
At some point in the future we may decide to participate again but for now, we cannot justify the inordinate bandwidth use.
Thank you for supporting CentOS and I'm sorry that this has happened (and seems to continue to happen).
No idea why that happened to you, the mirror I maintain seems fine, the general CentOS mirrors too - I guess we would have heard from sponsors.
I know we already had this discussion (or at least a similar one) before, do these people still have those problems?
Regards,
Ralph
Hi Scott,
On Thu, Jan 21, 2010 at 12:06:47 -0800, Scott Adametz wrote:
Due to an inordinate amount of Chinese based traffic from only a handful of IP addresses (over 145 TB transferred to just 12 IP addresses in only 15 days since we started hosting the mirror) we are forced to cease our participation in the CentOS mirror project.
This is an interesting report, because it doesn't fit the patterns that were seen in the past.
Of course, there could always be something new that we haven't seen yet -- but for the moment, I'll analyze your case on the common ground of what's known to me.
In our research before deciding to offer our support we were told to expect a sustained 3-5 Mbit/s of mirror traffic. In reality, and from only a handful of IPs, we regularly push over 200Mbit/s on our 300Mbit/s line. Each of the abusive IPs downloads the same DVD iso files over and over thousands of times. We have tried blocking the abusive IPs only to see another IP with a sequentially increased last octet take its place. Whether this is an outright attack or just an unfortunate coincidence matters not.
Let's check: with a connection that maxes out at 300 MBit/s, I calculate a maximum amount of data of 2.5 TB that can be delivered in 24 hours.
Within 15 days, you would not be able to deliver more than 40 TB, thus I think that the number of 145 TB that you report must be based on a miscalculation. By the above calculation, at least, it is impossible.
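The sanity check can be reproduced with a few lines of arithmetic. The sketch below assumes decimal units (1 Mbit = 10^6 bits, 1 TB = 10^12 bytes) and a line that is saturated around the clock with no protocol overhead, so it gives a raw upper bound. That bound comes out at roughly 3.2 TB per day and about 49 TB over 15 days, somewhat above the 2.5 TB and 40 TB quoted above, but still far below 145 TB.

```python
LINE_RATE_BITS_PER_S = 300e6      # 300 Mbit/s, decimal units assumed
SECONDS_PER_DAY = 24 * 60 * 60
DAYS = 15
REPORTED_TB = 145                 # figure from the original report

# Raw ceiling: every bit on the wire counted as payload, line always full.
tb_per_day = LINE_RATE_BITS_PER_S / 8 * SECONDS_PER_DAY / 1e12
tb_total = tb_per_day * DAYS

# Sustained rate that would be needed to move the reported volume.
needed_mbit_s = REPORTED_TB * 1e12 * 8 / (DAYS * SECONDS_PER_DAY) / 1e6

print(f"ceiling: {tb_per_day:.2f} TB/day, {tb_total:.1f} TB in {DAYS} days")
print(f"rate needed for {REPORTED_TB} TB: {needed_mbit_s:.0f} Mbit/s")
```

Moving 145 TB in 15 days would need a sustained rate of roughly 900 Mbit/s, about three times the capacity of the line.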
If you measured by looking at mod_status, or by analyzing Apache logs without using mod_logio, this is to be expected. The numbers that are logged there are grossly overestimating traffic because (as Adrian mentioned already) they don't log the effectively transferred amount, but numbers based on file size.
Regretfully, I must ask that we be delisted from the mirror list asap. Once our links are down, we will shut down the server.
At some point in the future we may decide to participate again but for now, we cannot justify the inordinate bandwidth use.
I guess that you already checked other means to assess actual network traffic, but if you didn't, I would recommend checking again in more detail.
(Your followup indicates that you used only webalizer and analog, i.e. purely Apache log analyzers that look only at the default log format, and that would lead to those skewed numbers.)
If you discover that the numbers were indeed wrong (I would be surprised if not :-) then the amount of data transferred might not look like a problem at all anymore. However, the number of connections opened by some clients might be the actual problem that you might want (and need) to fight against to protect your resources.
I have seen as many as 300 (!) parallel connections from single IP addresses, all downloading the very same file in range requests.
The fact that the connection from many Chinese networks to the rest of the world is so very bad is the _reason_ why those parallel connections are opened, and they persist for long periods of time because the net transfer rate is low.
You can address this issue by limiting parallel connections to your mirror by IP address, as was suggested in several followups to this thread. mod_limitipconn is the treatment of choice.
You'll have seen the number of connections being discussed in the thread as a possible cause of cutting off legitimate users. Indeed, I wouldn't go as low as 5 parallel connections, because with a number that low I see the risk as well. However, I can recommend "MaxConnPerIP 20" as a good value, which I have had no problems with whatsoever. This effectively limits the harm, as long as you don't encounter dozens of that kind of downloader -- but I have never seen more than a small handful in practice, with the mirrors that I maintained.
Of course, you could apply the limitation just to the large files (*.iso), to further reduce the risk of impacting legitimate users. That can be done with on-board Apache config, and there's also a handy NoIPLimit directive to define exclusion rules.
The typical "download accelerator" will download content in chunks (with partial GET requests / range requests) and often also open parallel connections to steal a little bandwidth from other users in their own interest. 2 parallel connections are suggested as per the HTTP standard (the newer, to-come HTTPbis standard will remove the limit); 4-5 connections is a frequent default, and some users might (mis-)configure their clients to use excessive numbers. As mentioned above, I have seen as many as 300, and that certainly became a problem for the mirror I was maintaining.
Now there's a little gotcha: The partial GET requests are logged correctly by Apache, even without mod_logio, as long as the client doesn't prematurely terminate the connection. When Apache gets a Range request for bytes x to y, it'll deliver that range and log the correct number (in the default log file). However, the more frequently used type of request that typical download clients make is "Range: bytes=12345-", i.e. they don't specify the end of the chunk they want, which means "till the end". They then wait until they have received as much data as they want, and decide whether to stick with the connection (if it's fast) or to terminate it. Now if they terminate it, Apache will log a wrong number (likely the whole file size).
Here's an example:
% curl -o /dev/null http://doozer.poeml.de/opensuse-education/ISOs/openSUSE-Edu-li-f-e-11.2-1-i6...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  1 2929M    1 32.7M    0     0  47.5M      0  0:01:01 --:--:--  0:01:01 47.6M^C
(Note how I terminated the client with Ctrl-C after a short while)
For this, Apache logs the following. (I'm breaking the line to make it more readable.) Note that there are two numbers appended to the common log format, which are those I added with mod_logio:
87.79.143.238 - - [23/Jan/2010:00:40:29 +0100] "GET /opensuse-education/ISOs/openSUSE-Edu-li-f-e-11.2-1-i686.iso HTTP/1.1" 200 3071279104 "-" "curl/7.19.7 (i386-apple-darwin9.8.0) libcurl/7.19.7 zlib/1.2.3" 189 10020960
As can be seen here, %b from mod_log_config is filled with the whole file size (3071279104 bytes, about 3 GB in this case), because that's what Apache intended to send. However, the actually sent bytes were just 10 MB (last number).
For connections that run to the end, or for range requests that come _with_ an end of the range specified, and are not terminated prematurely, the number would actually be correct, which you can easily try out.
However, the above "special case" ruins your statistic. Thus, it's good to always use mod_logio, and/or use vnstat or other (e.g. external) means to keep an eye on network traffic.
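To see how far the two numbers can diverge, the log line from the example can be parsed and the %b field compared with the byte count added by mod_logio. The Python sketch below is only an illustration: it assumes the line is in combined format with two trailing fields in the order bytes received, bytes sent (mod_logio's %I and %O), and the regular expression is written for this one line rather than as a general log parser.

```python
import re

# Combined log format followed by two mod_logio counters (assumed order: %I %O).
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<b>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<bytes_in>\d+) (?P<bytes_out>\d+)$'
)

# The log line quoted in the example above, joined back into one line.
line = (
    '87.79.143.238 - - [23/Jan/2010:00:40:29 +0100] '
    '"GET /opensuse-education/ISOs/openSUSE-Edu-li-f-e-11.2-1-i686.iso HTTP/1.1" '
    '200 3071279104 "-" '
    '"curl/7.19.7 (i386-apple-darwin9.8.0) libcurl/7.19.7 zlib/1.2.3" '
    '189 10020960'
)

m = LOG_RE.match(line)
logged = int(m.group("b"))        # what %b claims was sent
sent = int(m.group("bytes_out"))  # what actually went over the wire
overcount = logged / sent

print(f"logged {logged} bytes, sent {sent} bytes, factor {overcount:.0f}")
```

For this single aborted download the default log field overstates the traffic by a factor of about 300, which is the kind of error that adds up quickly when a statistics tool sums %b over thousands of interrupted range requests.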
(Don't trust mod_status in this regard, either - the numbers are even worse; even a HEAD request will count the entire file size.)
I hope you find out that you didn't really experience those extreme amounts of traffic, as it might have looked at first! Hopefully, the above offers a much more welcome explanation of what happened.
Good Luck!
Peter