[CentOS] Socket behavior change from 6.5 to 6.6
Gordon Messmer
gordon.messmer at gmail.com
Wed Jan 21 18:09:55 UTC 2015
On 01/21/2015 08:49 AM, Glenn Eychaner wrote:
> Diagnosis:
> the previous behavior of
> receiving a 0-length recv() on the old server socket is unsupported and
> unreliable.
You mention that a lot, and it might help to understand why that happens.
A 0-length recv() on a standard (blocking) socket indicates end-of-file.
The remote side has closed the connection.
What you were previously seeing was the client sending SYN to establish
a new connection. Because it was unrelated to the existing connection
on the same 5-tuple, the server's TCP stack closed the existing socket.
I'm not positive, but the server may have sent a keepalive or other
probe to the client and got a RST. Either way, the kernel determined
that the socket had been closed by the client, and a 0-length read
(recv) is the way that the kernel informs an application of that closure.
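For illustration, here is a minimal sketch of how that closure looks from the application side. It uses socket.socketpair() in place of a real TCP connection (an assumption made to keep it self-contained), and the function name is made up for this example. Data queued before the close is still delivered; the next recv() returns an empty result.

```python
import socket


def recv_after_peer_close():
    # socketpair() returns two connected stream sockets; no network needed.
    server_side, client_side = socket.socketpair()
    try:
        client_side.sendall(b"last message")
        client_side.close()  # the peer closes its end of the connection

        first = server_side.recv(4096)   # data sent before the close still arrives
        second = server_side.recv(4096)  # nothing left: recv() returns b"" (end-of-file)
        return first, second
    finally:
        server_side.close()


if __name__ == "__main__":
    first, second = recv_after_peer_close()
    print(repr(first), repr(second))
```

An application that treats the empty result as "peer closed" and closes its own socket is behaving correctly.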
> Until the update to CentOS 6.6 'broke' the existing functionality,
> I had never looked deeply into the connection between the client and the
> server; it 'just worked', so I left it alone. Once it did break, I realized
> that because the client was connecting on the same port every time, the
> whole setup might have been relying on unsupported behavior.
Not just unsupported, but incorrect. Unrelated packets with a 5-tuple
matching an established socket are typically injection attacks. TCP is
supposed to discard them.
> Other diagnostics:
> One test I intend to run in a couple of weeks (next opportunity) is to boot
> the CentOS 6.6 box with the older kernel, in order to find out whether the
> behavior change is in the kernel or in the libraries.
It's always good to test, but it's almost certainly the kernel.
Libraries don't decide whether or not a socket has closed, which is what
the 0-length read (recv) indicates.
> Correct solutions:
> 1) Client port: The client should be connecting on a random, ephemeral port
Yes.
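As a rough sketch of what that means in practice: if the client does not bind() before connect(), the kernel picks an unused ephemeral source port for each connection. The listener below is a throwaway on 127.0.0.1 and the function name is hypothetical; neither comes from the original setup.

```python
import socket


def ephemeral_source_ports(count=2):
    # Throwaway local listener; port 0 lets the kernel choose the listening port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    host, port = listener.getsockname()

    clients = []
    try:
        for _ in range(count):
            # No bind() on the client side: the kernel assigns the source port.
            client = socket.create_connection((host, port))
            clients.append(client)
        # getsockname() reports the local (source) address of each connection.
        return [client.getsockname()[1] for client in clients]
    finally:
        for client in clients:
            client.close()
        listener.close()


if __name__ == "__main__":
    print(ephemeral_source_ports())
```

Because both connections are open at the same time and go to the same destination, the kernel has to give them different source ports, so a restarted client never collides with a stale connection.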
> 2) Protocol change: The server never writes to the socket in the existing
> protocol, and can therefore never find out that the connection is dead.
> Writing to the socket would reveal this. But what happens if the server writes
> to the socket, and the client never reads?
You will eventually fill the client's receive buffer and then the
server's send buffer, and at that point any further write (send) on a
blocking socket will block until the client reads or the connection is
torn down.
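A small sketch of the buffer filling up, assuming a socketpair() stands in for the real connection and the reader never calls recv(). The writer is put in non-blocking mode only so the demonstration returns instead of hanging; the function name is made up here.

```python
import socket


def bytes_queued_before_blocking():
    writer, reader = socket.socketpair()
    # Non-blocking mode: a full buffer raises BlockingIOError instead of hanging.
    writer.setblocking(False)
    chunk = b"x" * 4096
    queued = 0
    try:
        while True:
            try:
                queued += writer.send(chunk)  # the reader never drains this
            except BlockingIOError:
                # Buffers are full. A blocking socket would stall right here.
                return queued
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print("queued before send() would block:", bytes_queued_before_blocking())
```

The exact number of bytes depends on the kernel's buffer sizes, so no particular value should be expected.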
> 3) Several people suggested using SO_REUSEADDR and/or an SO_LINGER of zero to
> drop the socket out of TIME_WAIT, but does the socket enter TIME_WAIT as soon
> as the client crashes? I didn't think so, but I may be wrong.
No. A socket enters TIME_WAIT only after the connection closes, and only
on the side that closed first. If the socket were closing, you'd be
getting a 0-length read (recv). You can confirm that with "netstat".
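If you'd rather check from a script, the same state information netstat shows is available on Linux in /proc/net/tcp. The sketch below counts sockets per state; the function name is hypothetical, and it assumes the usual layout of that file, where the fourth column holds the state as a hex code (06 is TIME_WAIT).

```python
from collections import Counter

# Hex state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(lines):
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row and anything too short to hold a state column.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        state = TCP_STATES.get(fields[3].upper(), fields[3])
        counts[state] += 1
    return counts


if __name__ == "__main__":
    # Linux only; IPv6 sockets are listed separately in /proc/net/tcp6.
    with open("/proc/net/tcp") as table:
        print(count_tcp_states(table))
```

If the old connection shows up as ESTABLISHED on the server rather than TIME_WAIT, the server has not noticed the client going away.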