[CentOS] Weird TCP problem

Fri Sep 5 05:44:25 UTC 2008
Gordon Messmer <yinyang at eburg.com>

Last week, I started seeing very strange behavior in one of the networks
that I manage.

The office LAN uses a Linux firewall which masquerades their
workstations over their DSL connection.  There are probably ~75
workstations in the office LAN.

Their mail server is in a colocation facility nearby.  That server has
an RFC1918 address; its router does DNAT to forward packets to the system.

Both the office firewall and the mail server are currently running fully 
patched CentOS 5.2.

Here's the weird part: If a machine running Linux in the office LAN
attempts to connect to the mail server on any TCP port, there's a small
chance that the server will simply ignore the SYN packets.  It doesn't
log any errors.  If I'm running tcpdump, I see the incoming SYN packets,
but no reply.  If I use iptables to log the packets, information about
the packet is saved in the messages file.  If I capture the packets and
use wireshark to analyze them, I don't see anything odd: the checksums
are good and I can't see any difference between a SYN packet that gets a
SYN+ACK and one that's ignored (beyond the obvious: different timestamps
and checksums).  The problem doesn't seem to affect Windows workstations
in the office LAN.  As far as I can tell, only SYN packets are dropped.
I don't see delays in established connections.

I've attached a file that contains, first, the output of tcpdump which
shows packets to or from the office's firewall address, as recorded by
the destination server.  The first four SYN packets are ignored, but the
kernel proceeds with the TCP handshake after the fifth SYN packet.
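
As a rough illustration, the spacing of the five SYN packets in the
attached capture can be checked with a short Python sketch.  The
timestamps are copied from the tcpdump output; the helper name
gaps_in_seconds is made up for this example.  All five SYNs carry the
same sequence number, so they are retransmissions of one connection
attempt, and the gaps between them double each time, which is what
ordinary client-side SYN retransmission backoff looks like.

```python
from datetime import datetime

# Arrival times of the five SYN packets, copied from the attached
# tcpdump output (as seen on the destination server).
syn_times = [
    "19:08:43.751579",
    "19:08:46.751136",
    "19:08:52.749305",
    "19:09:04.747287",
    "19:09:28.741854",
]

def gaps_in_seconds(stamps):
    """Return the delay between consecutive timestamps, in seconds."""
    parsed = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]

gaps = gaps_in_seconds(syn_times)
print([round(g) for g in gaps])  # prints [3, 6, 12, 24]
```

The total delay before the handshake proceeds is therefore about 45
seconds for this connection attempt.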

Second, the file contains the log messages which are recorded as a
result of these iptables rules:

iptables -A INPUT -p tcp -s officefw --dport 22 -j LOG
iptables -A INPUT -p tcp -s officefw --dport 22 -j ACCEPT

Those are the only iptables rules present on the server accepting the
connections.
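
For anyone who wants to compare the logged packets field by field, the
following is a minimal Python sketch that splits one line produced by
the LOG rule into its KEY=value fields and its TCP flag tokens.  The
function name parse_log_line and the regular expression are assumptions
made for this example, not part of any iptables tooling, and the sample
line is the first log entry from the attachment.

```python
import re

# Matches KEY=value pairs in an iptables LOG line; the value may be
# empty (for example "OUT=").
FIELD_RE = re.compile(r"(\w+)=(\S*)")
# Bare tokens that the LOG target prints for TCP flags that are set.
TCP_FLAGS = {"SYN", "ACK", "FIN", "RST", "PSH", "URG"}

def parse_log_line(line):
    """Split one iptables LOG line into (fields dict, list of TCP flags)."""
    # Drop the syslog prefix (date, host, "kernel:") before scanning.
    _, _, body = line.partition("kernel: ")
    fields = dict(FIELD_RE.findall(body))
    flags = [tok for tok in body.split() if tok in TCP_FLAGS]
    return fields, flags

sample = (
    "Sep  4 19:08:43 remoteserver kernel: IN=eth0 OUT= "
    "MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw "
    "DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41807 DF "
    "PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0"
)

fields, flags = parse_log_line(sample)
print(fields["SPT"], fields["DPT"], fields["ID"], flags)
```

Running this over every line of the log makes it easy to confirm that
the ignored SYNs differ from the accepted one only in the IP ID field.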

Both of those appear to indicate that the server in the colo facility is
receiving the SYN packets.  What possible reasons are there that it
would not reply with SYN+ACK?

-------------- next part --------------
19:08:43.751579 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705398041 0,nop,wscale 7>
19:08:46.751136 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705401041 0,nop,wscale 7>
19:08:52.749305 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705407041 0,nop,wscale 7>
19:09:04.747287 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705419041 0,nop,wscale 7>
19:09:28.741854 IP officefw.57948 > remoteserver.ssh: S 3347102294:3347102294(0) win 5840 <mss 1460,sackOK,timestamp 2705443041 0,nop,wscale 7>
19:09:28.742540 IP remoteserver.ssh > officefw.57948: S 3324501337:3324501337(0) ack 3347102295 win 5792 <mss 1460,sackOK,timestamp 15960089 2705443041,nop,wscale 7>
19:09:28.783886 IP officefw.57948 > remoteserver.ssh: . ack 1 win 46 <nop,nop,timestamp 2705443083 15960089>
19:09:28.789814 IP remoteserver.ssh > officefw.57948: P 1:21(20) ack 1 win 46 <nop,nop,timestamp 15960137 2705443083>
19:09:28.829114 IP officefw.57948 > remoteserver.ssh: . ack 21 win 46 <nop,nop,timestamp 2705443129 15960137>


Sep  4 19:08:43 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41807 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 
Sep  4 19:08:46 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41808 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 
Sep  4 19:08:52 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41809 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 
Sep  4 19:09:04 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41810 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 
Sep  4 19:09:28 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=60 TOS=0x00 PREC=0x00 TTL=56 ID=41811 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 
Sep  4 19:09:28 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=52 TOS=0x00 PREC=0x00 TTL=56 ID=41812 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=46 RES=0x00 ACK URGP=0 
Sep  4 19:09:28 remoteserver kernel: IN=eth0 OUT= MAC=00:30:48:97:5a:3a:00:0a:b8:8e:53:29:08:00 SRC=officefw DST=remoteserver LEN=52 TOS=0x00 PREC=0x00 TTL=56 ID=41813 DF PROTO=TCP SPT=57948 DPT=22 WINDOW=46 RES=0x00 ACK URGP=0