Hello:
I have a machine running CentOS 5 x86_64.
It is running apache httpd and tomcat.
For some reason, after running for a few days, web requests stop responding. It happened again this morning. I checked the syslog and saw a HUGE number of entries like this:
OUTPUT IN= OUT=eth0 SRC=[MyIP] DST=[OutsideIP] LEN=532 TOS=0x00 PREC=0x00 TTL=64 ID=52669 DF PROTO=TCP SPT=80 DPT=54697 WINDOW=61 RES=0x00 ACK PSH FIN URGP=0
Here are my iptables commands for http connections (I have the default policy set to drop):
# Allow http connections from the outside world
/sbin/iptables -A INPUT -i eth0 -d $ETH0_IP -p tcp --sport 1024: --dport http -m state --state NEW,ESTABLISHED -j ACCEPT
/sbin/iptables -A OUTPUT -o eth0 -s $ETH0_IP -p tcp --sport http --dport 1024: -m state --state ESTABLISHED -j ACCEPT
Here are some strange things:
1. I have the exact same rules running on two other servers which do not give me any trouble.
2. If I stop and restart httpd and tomcat, the problem goes away.
This suggests the firewall is not a problem.
Any ideas what is going on?
Thanks, Neil
-- Neil Aggarwal, (832)245-7314, www.JAMMConsulting.com Eliminate junk email and reclaim your inbox. Visit http://www.spammilter.com for details.
Hi,
On Thu, Nov 6, 2008 at 09:33, Neil Aggarwal neil@jammconsulting.com wrote:
> # Allow http connections from the outside world
> /sbin/iptables -A INPUT -i eth0 -d $ETH0_IP -p tcp --sport 1024: --dport http -m state --state NEW,ESTABLISHED -j ACCEPT
> /sbin/iptables -A OUTPUT -o eth0 -s $ETH0_IP -p tcp --sport http --dport 1024: -m state --state ESTABLISHED -j ACCEPT
>
> Any ideas what is going on?
If you're using ESTABLISHED, it depends on ip_conntrack being able to track the connections. ip_conntrack keeps a table of all connections, but this table is limited in size, so it may be overflowing.
You can see how many entries you have in that table at any moment with this command:
# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count
And you can see what the maximum is set to with this command:
# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
The default in CentOS 5 is 16k connections.
IIRC, you can increase that dynamically with echo ... >/proc/sys/... or with sysctl. Also, I believe you can make the setting persist across reboots in /etc/sysctl.conf. I think it's also possible to do that in /etc/modprobe.conf, but I'm not sure what the syntax is anymore, and modinfo ip_conntrack didn't give me any clues. Google should help with that.
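A minimal sketch of the sysctl route mentioned above. It assumes the sysctl key mirrors the /proc path quoted in this message (net.ipv4.netfilter.ip_conntrack_max); the helper name persist_conntrack_max and the value 131072 are illustrative only. Note that the key exists only while the ip_conntrack module is loaded.

```shell
#!/bin/sh
# Key name derived from /proc/sys/net/ipv4/netfilter/ip_conntrack_max.
KEY=net.ipv4.netfilter.ip_conntrack_max

# persist_conntrack_max VALUE [FILE]  (hypothetical helper)
# Replace an existing setting for KEY in FILE, or append one if none exists.
persist_conntrack_max() {
    value=$1
    file=${2:-/etc/sysctl.conf}
    if grep -q "^[[:space:]]*$KEY[[:space:]]*=" "$file" 2>/dev/null; then
        # GNU sed in-place edit of the existing line.
        sed -i "s|^[[:space:]]*$KEY[[:space:]]*=.*|$KEY = $value|" "$file"
    else
        printf '%s = %s\n' "$KEY" "$value" >> "$file"
    fi
}

# Runtime change (needs root, lost on reboot):
#   sysctl -w net.ipv4.netfilter.ip_conntrack_max=131072
# Persistent change, then reload the file:
#   persist_conntrack_max 131072 && sysctl -p
```

The helper only edits the file; nothing changes in the running kernel until sysctl -w or sysctl -p is run as root.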
HTH, Filipe
Filipe:
Thanks for the information.
If I do: cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max on each of my servers, they all report 65536 which seems like a pretty high limit.
If I do: cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count on each of my servers, the highest number is just over 1100.
If this is the source of the problem, how would restarting httpd and tomcat help? I did not restart the machine nor reset iptables.
I am not asking this to be argumentative, just trying to understand how the facts I am seeing are related.
Thanks, Neil
Hi,
On Thu, Nov 6, 2008 at 10:42, Neil Aggarwal neil@jammconsulting.com wrote:
> If this is the source of the problem, how would restarting httpd and tomcat help? I did not restart the machine nor reset iptables.
Because this might potentially close several connections and free slots in the conntrack table.
You are right that your conntrack table size is high enough and this should not be happening. It might be an attack, a synflood or something, that is causing this problem to happen. In that case, the half-open connections will be kept in the table, but as the other side will not complete the handshake, they will only be removed from the table after a timeout. I also think that when you stop Apache, there will be no process listening on port 80 anymore, and then conntrack may get rid of those half-open connections since the other side is not listening anymore. A lot of speculation here, but it might be what is affecting you.
In any case, next time you have this same problem, consider looking at the counters to see if _count is reaching _max; that would confirm the hypothesis.
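One way to make the _count versus _max comparison quick to read during an outage is sketched below. The function name conntrack_usage is made up for illustration; the two file names are the ones given earlier in the thread, and the directory can be overridden.

```shell
#!/bin/sh
# conntrack_usage [DIR]  (hypothetical helper)
# Print "count/max (pct%)" using the two counters discussed in this thread.
conntrack_usage() {
    dir=${1:-/proc/sys/net/ipv4/netfilter}
    count=$(cat "$dir/ip_conntrack_count") || return 1
    max=$(cat "$dir/ip_conntrack_max") || return 1
    # Integer percentage, rounded down.
    echo "$count/$max ($((count * 100 / max))%)"
}
```

Calling conntrack_usage from a cron job, with the output appended to a log file next to a timestamp, would leave a record to look at after the next hang.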
If that is indeed the case, you can dump the conntrack information with this command:
# cat /proc/net/ip_conntrack
You can do that and save it to another file, restart Apache and do the same, so that you can see what is really happening there. This might give you a better idea of why it's happening.
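To compare the two dumps, a per-state count is often easier to read than a raw diff. The sketch below assumes the usual /proc/net/ip_conntrack layout, where TCP lines start with "tcp" and carry the connection state in the fourth field; the function name conntrack_states is hypothetical.

```shell
#!/bin/sh
# conntrack_states [FILE]  (hypothetical helper)
# Count TCP entries per state in a conntrack dump, most frequent first.
# Assumes field 1 is the protocol name and field 4 is the TCP state.
conntrack_states() {
    awk '$1 == "tcp" { n[$4]++ } END { for (s in n) print n[s], s }' \
        "${1:-/proc/net/ip_conntrack}" | sort -rn
}
```

For example, save one dump before restarting Apache and one after, then run conntrack_states on each file. A large SYN_RECV count in the first dump would be consistent with the synflood guess.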
If conntrack is really overflowing, you may consider increasing the table size, but this will mean more memory usage on your server.
Alternatively, you might choose to redo your firewall rules to be stateless, by removing --state NEW and --state ESTABLISHED, and by adding ! --syn on the ones you want to allow for established connections only. It's not going to be as perfect as actually tracking the connections, but for protocols like HTTP it is a good enough compromise.
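As a sketch of what the stateless variant could look like, the function below only prints the two rewritten rules so they can be reviewed before anything is applied. It reuses the interface, ports, and $ETH0_IP convention from the original rules; the function name stateless_http_rules is made up, and the ! --syn placement follows the older iptables syntax, so check it against your iptables version.

```shell
#!/bin/sh
# stateless_http_rules IP  (hypothetical helper)
# Print stateless equivalents of the two HTTP rules; nothing is applied.
stateless_http_rules() {
    ip=$1
    # Inbound: accept any TCP packet to port 80, with no state match.
    echo "/sbin/iptables -A INPUT -i eth0 -d $ip -p tcp --sport 1024: --dport http -j ACCEPT"
    # Outbound: accept replies from port 80, but never a bare SYN,
    # so the server cannot open new connections from that port.
    echo "/sbin/iptables -A OUTPUT -o eth0 -s $ip -p tcp ! --syn --sport http --dport 1024: -j ACCEPT"
}
```

Running stateless_http_rules "$ETH0_IP" shows the commands; once reviewed, they can be pasted into the firewall script in place of the stateful pair.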
But your problem is probably being caused by something else, like an attack, so probably the best way to deal with it is to find out what is causing it and try to take measures to correct that problem instead.
> I am not asking this to be argumentative, just trying to understand how the facts I am seeing are related.
No problem! Didn't sound argumentative to me in any way.
Let us know how that goes, and if you get more clues, let us know if you need more help fixing the root problem.
HTH, Filipe
Filipe:
I changed the firewall rules on the server that had stopped responding to not use ESTABLISHED.
Now, one of the servers that was still using ESTABLISHED stopped responding.
I am seeing logs like this in the syslog:
OUTPUT IN= OUT=eth0 SRC=[myIP] DST=[otherIP] LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=35076 DF PROTO=TCP SPT=80 DPT=36953 WINDOW=54 RES=0x00 ACK PSH FIN URGP=0
I did: cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count and it gave me: 615
That suggests the conntrack table is not overflowing, yet the firewall was blocking the outbound traffic.
I updated all my servers to not use ESTABLISHED, but I am still baffled as to how this could occur.
Any other ideas?
Thanks, Neil
Filipe:
One of my servers stopped responding again. This time, it was one of those which was not using ESTABLISHED.
I am now convinced the problem is not in the firewall. It must be somewhere in Apache, Tomcat, or (most likely) my application code. I think I was seeing the firewall logs after I restarted Apache because the responses were rejected once they were no longer attached to an established connection.
Sorry for the red herring.
Neil
Hi,
On Wed, Nov 12, 2008 at 12:44, Neil Aggarwal neil@jammconsulting.com wrote:
> Sorry for the red herring.
No problem.
> I am now convinced the problem is not in the firewall. It must be somewhere in Apache, Tomcat, or (most likely) my application code. I think I was seeing the firewall logs after I restarted Apache because the responses were rejected once they were no longer attached to an established connection.
Look into the number of busy httpd servers; that might be your problem (and why it's not accepting any new connections). To do that you can use a configured URL in Apache (I believe it is /server-status), or you can at least estimate it using "ps" and comparing with the settings for the maximum number of servers in your httpd.conf.
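The "ps" estimate could be sketched roughly as follows. It assumes the stock CentOS config path /etc/httpd/conf/httpd.conf and that the relevant limit is the MaxClients directive; both function names are hypothetical. The process count includes the parent httpd process, so treat it as an approximation.

```shell
#!/bin/sh
# configured_max_clients [FILE]  (hypothetical helper)
# Print every uncommented MaxClients value; a stock config may list one
# per MPM section, so more than one line can come back.
configured_max_clients() {
    awk '$1 == "MaxClients" { print $2 }' "${1:-/etc/httpd/conf/httpd.conf}"
}

# running_httpd  (hypothetical helper)
# Count processes named httpd, parent included.
running_httpd() {
    ps -C httpd --no-headers | wc -l
}
```

If running_httpd reports a number at or near one of the values printed by configured_max_clients while requests are hanging, that would point at Apache running out of workers rather than at the firewall.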
HTH, Filipe
Neil Aggarwal wrote on Thu, 6 Nov 2008 08:33:59 -0600:
> /sbin/iptables -A OUTPUT -o eth0 -s $ETH0_IP -p tcp --sport http --dport 1024: -m state --state ESTABLISHED -j ACCEPT
Why do you try to filter outbound connections at all? If "something" makes it on your machine the first thing they will do is drop your rules.
Kai
> Why do you try to filter outbound connections at all? If "something" makes it on your machine the first thing they will do is drop your rules.
You imply the *only* reason for outbound filtering is to stop a hacker. In some environments it serves as an additional layer of protection against other problems, for example configuration or application issues.
jlc