Last week, I started seeing very strange behavior in one of the networks that I manage.
The office LAN uses a Linux firewall which masquerades their workstations over their DSL connection. There are probably ~75 workstations in the office LAN.
Their mail server is in a collocated facility nearby. That server has an RFC1918 address; its router does SNAT to forward packets to the system.
Both the office firewall and the mail server are currently running fully patched CentOS 5.2.
Here's the weird part: If a machine running Linux in the office LAN attempts to connect to the mail server on any TCP port, there's a small chance that the server will simply ignore the SYN packets. It doesn't log any errors. If I'm running tcpdump, I see the incoming SYN packets, but no reply. If I use iptables to log the packets, information about each packet is saved in the messages file. If I capture the packets and use Wireshark to analyze them, I don't see anything odd: the checksums are good and I can't see any difference between a SYN packet that gets a SYN+ACK and one that's ignored (beyond the obvious: different timestamps and checksums). The problem doesn't seem to affect Windows workstations in the office LAN. As far as I can tell, only SYN packets are dropped. I don't see delays in established connections.
I've attached a file that contains, first, the output of tcpdump which shows packets to or from the office's firewall address, as recorded by the destination server. The first four SYN packets are ignored, but the kernel proceeds with the TCP handshake after the fifth SYN packet.
Second, the file contains the log messages which are recorded as a result of these iptables rules:
iptables -A INPUT -p tcp -s officefw --dport 22 -j LOG
iptables -A INPUT -p tcp -s officefw --dport 22 -j ACCEPT
Those are the only iptables rules present on the server accepting the connections.
Both of those appear to indicate that the server in the colo facility is receiving the SYN packets. What possible reasons are there that it would not reply with SYN+ACK?
19:08:43.751579 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705398041 0,nop,wscale 7>
19:08:46.751136 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705401041 0,nop,wscale 7>
19:08:52.749305 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705407041 0,nop,wscale 7>
19:09:04.747287 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705419041 0,nop,wscale 7>
19:09:28.741854 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705443041 0,nop,wscale 7>
19:09:28.742540 IP remoteserver.ssh > officefw.57948: S 3324501337:3324501337(0) ack 3347102295 win 5792 <mss 1460,sackOK,timestamp 15960089 2705443041,nop,wscale 7>
19:09:28.783886 IP officefw.57948 > remoteserver.ssh: . ack 1 win 46 <nop,nop,timestamp 2705443083 15960089>
19:09:28.789814 IP remoteserver.ssh > officefw.57948: P 1:21(20) ack 1 win 46 <nop,nop,timestamp 15960137 2705443083>
19:09:28.829114 IP officefw.57948 > remoteserver.ssh: . ack 21 win 46 <nop,nop,timestamp 2705443129 15960137>
Sep 4 19:08:43 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41807 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0
Sep 4 19:08:46 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41808 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0
Sep 4 19:08:52 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41809 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0
Sep 4 19:09:04 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41810 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0
Sep 4 19:09:28 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41811 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0
Sep 4 19:09:28 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=52 TOS=0x00 PREC=0x00 TTL=56 ID=41812 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=46 RES=0x00 ACK URGP=0
Sep 4 19:09:28 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=52 TOS=0x00 PREC=0x00 TTL=56 ID=41813 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=46 RES=0x00 ACK URGP=0
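One way to read the trace: all five SYNs carry the same sequence number (3347102294) and consecutive IP IDs, so they look like retransmissions of a single connection attempt rather than five separate attempts. A minimal Python sketch, using only the five SYN timestamps copied from the tcpdump output above, computes the gaps between them; the helper name `retransmit_gaps` is made up for this illustration. If the gaps roughly double each time, that would be consistent with the client's SYN retransmission timer backing off while the server stays silent.

```python
from datetime import datetime

# Arrival times of the five SYN packets, copied from the tcpdump output.
syn_times = [
    "19:08:43.751579",
    "19:08:46.751136",
    "19:08:52.749305",
    "19:09:04.747287",
    "19:09:28.741854",
]


def retransmit_gaps(stamps):
    """Return the gap in seconds between consecutive HH:MM:SS.ffffff stamps."""
    parsed = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]


gaps = retransmit_gaps(syn_times)
rounded = [round(g) for g in gaps]
print(rounded)  # [3, 6, 12, 24]
```

The doubling pattern says nothing about why the server ignored the first four SYNs; it only suggests the client side was behaving normally while it waited.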
Gordon Messmer wrote:
Both of those appear to indicate that the server in the colo facility is receiving the SYN packets. What possible reasons are there that it would not reply with SYN+ACK?
Maybe it does reply, just on a different interface? Is this a multi-homed system? Bonded interfaces?
Gordon Messmer wrote:
Florin Andrei wrote:
Maybe it does reply, just on a different interface? Is this a multi-homed system? Bonded interfaces?
There's only one interface with an IP address, and only one route back to the office.
If you were running tcpdump in promiscuous mode, re-run the tests with it non-promiscuous. Just to make sure the SYN is actually received by that system.
Florin Andrei wrote:
If you were running tcpdump in promiscuous mode, re-run the tests with it non-promiscuous. Just to make sure the SYN is actually received by that system.
I ran the test again with "tcpdump -i eth0 -p" and then, thinking better of it, with "tcpdump -i any". In both cases I was able to replicate the problem. Even with "-i any" I see the incoming SYN packets and no SYN+ACK reply.
Gordon Messmer wrote:
Florin Andrei wrote:
If you were running tcpdump in promiscuous mode, re-run the tests with it non-promiscuous. Just to make sure the SYN is actually received by that system.
I ran the test again with "tcpdump -i eth0 -p" and then, thinking better of it, with "tcpdump -i any". In both cases I was able to replicate the problem. Even with "-i any" I see the incoming SYN packets and no SYN+ACK reply.
And you do capture *all* traffic, right? Don't filter out anything.
Florin Andrei wrote:
And you do capture *all* traffic, right? Don't filter out anything.
Well, I'm filtering out the ports that are busy (http and https, imap and imaps, and smtp), but I'm testing access to ssh, since the problem appears to be independent of the port. I get the delays on port 22 and don't see anything in reply. No ICMP messages, either.
On Fri, Sep 05, 2008 at 11:29:22AM -0700, Gordon Messmer wrote:
Florin Andrei wrote:
Maybe it does reply, just on a different interface? Is this a multi-homed system? Bonded interfaces?
There's only one interface with an IP address, and only one route back to the office.
Sounds like the remote server may actually be busy and the number of outstanding connections is reaching the listen() backlog limit, so the remote server isn't doing an accept() and the three-way handshake isn't completing.
That's what I'd anticipate if there was no NAT...
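The backlog theory can be checked against kernel counters. The sketch below is an illustration, not something anyone in the thread ran: it assumes the server's kernel exposes `ListenOverflows` and `ListenDrops` in the `TcpExt:` lines of /proc/net/netstat, as current Linux kernels do, and the function names are invented for the example. If those counters do not rise while SYNs are being ignored, a full listen queue becomes a less likely explanation.

```python
def parse_tcpext(text):
    """Parse the 'TcpExt:' header/value line pair from /proc/net/netstat text."""
    rows = [line.split() for line in text.splitlines()
            if line.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    # First row holds counter names, second row holds the matching values.
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def listen_queue_counters(text):
    """Return the two listen-queue counters (None if the kernel lacks them)."""
    stats = parse_tcpext(text)
    return {key: stats.get(key) for key in ("ListenOverflows", "ListenDrops")}


if __name__ == "__main__":
    try:
        with open("/proc/net/netstat") as f:
            print(listen_queue_counters(f.read()))
    except OSError:
        print("/proc/net/netstat is not available on this system")
```

Run it once before and once after reproducing a stalled connection and compare the two readings; the absolute values matter less than whether they change.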
Stephen Harris wrote:
Sounds like the remote server may actually be busy and the number of outstanding connections is reaching the listen() backlog limit, so the remote server isn't doing an accept() and the three-way handshake isn't completing.
That thought crossed my mind, too. However, it doesn't explain why the problem only affects new connections from Linux hosts. Also, I believe that the listen queue is per-port, and this problem affects all ports equally, including oddball ports used by the web server. I'm the only one connecting to those ports.
There are certainly differences between the SYN packets from Linux hosts and those from Windows. On Linux, I see a window size of 5840 and these TCP options: MSS=1460, SACK permitted, timestamps, and window scale 7.
On Windows, the window size is 65535, MSS=1460, and SACK permitted.
However, I have no idea why any of those differences would cause a problem intermittently.
on 9-5-2008 4:51 PM Gordon Messmer spake the following:
Stephen Harris wrote:
Sounds like the remote server may actually be busy and the number of outstanding connections is reaching the listen() backlog limit, so the remote server isn't doing an accept() and the three-way handshake isn't completing.
That thought crossed my mind, too. However, it doesn't explain why the problem only affects new connections from Linux hosts. Also, I believe that the listen queue is per-port, and this problem affects all ports equally, including oddball ports used by the web server. I'm the only one connecting to those ports.
There are certainly differences between the SYN packets from Linux hosts and those from Windows. On Linux, I see a window size of 5840 and these TCP options: MSS=1460, SACK permitted, timestamps, and window scale 7.
On Windows, the window size is 65535, MSS=1460, and SACK permitted.
However, I have no idea why any of those differences would cause a problem intermittently.
Unless there is a router somewhere in between that mishandles window scaling. I have had that problem in the past.
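For readers unfamiliar with window scaling, here is a small worked example using the numbers from the capture ("wscale 7", "win 5840" on the SYN, "win 46" on the later ACKs). It is a sketch of the standard arithmetic only; the function name `effective_window` is made up here, and nothing in the thread confirms that a device on this path actually ignores the option. The window field in a SYN is never scaled, while later segments are shifted left by the negotiated scale.

```python
def effective_window(advertised, shift, syn=False):
    """Window in bytes; the window field of a SYN segment is never scaled."""
    return advertised if syn else advertised << shift


WSCALE = 7  # "wscale 7" offered by the Linux client in the capture

syn_window = effective_window(5840, WSCALE, syn=True)  # 5840 bytes
scaled_window = effective_window(46, WSCALE)           # 46 * 128 = 5888 bytes
unscaled_view = effective_window(46, 0)                # 46 bytes

print(syn_window, scaled_window, unscaled_view)
```

A device that tracks the connection but disregards the scale factor would see a 46-byte window where the endpoints mean 5888 bytes, which is the kind of mismatch that causes trouble on established connections.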