I spent a bunch of time researching TIME_WAIT on Linux and didn't find much useful information. There are a couple of kernel parameters to change the settings, though the only docs I could find for them say don't touch them unless you REALLY know what you're doing.
The only things I found are the hardcoded values in include/net/tcp.h:
#define TCP_TIMEWAIT_LEN (60*HZ) /* how long to wait to destroy TIME-WAIT
                                  * state, about 60 seconds */
#define TCP_FIN_TIMEOUT TCP_TIMEWAIT_LEN
                                 /* BSD style FIN_WAIT2 deadlock breaker.
                                  * It used to be 3min, new value is 60sec,
                                  * to combine FIN-WAIT-2 timeout with
                                  * TIME-WAIT timer. */
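For what it's worth (my addition, not from the docs above): the 60s TIME-WAIT length itself is a compile-time constant, but the related knobs that do exist are exposed as sysctls under /proc/sys/net/ipv4. A quick way to see what a box is currently running with:

    # TIME-WAIT-related tunables (read-only check, no changes made)
    sysctl net.ipv4.tcp_fin_timeout      # FIN-WAIT-2 timeout, NOT the TIME-WAIT length
    sysctl net.ipv4.tcp_tw_reuse         # reuse of TIME-WAIT sockets for new outgoing connections
    sysctl net.ipv4.tcp_max_tw_buckets   # cap on how many TIME-WAIT sockets are kept at all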
Our "issue" is on the LAN side: front-end servers connecting to the DBs. So I wonder whether 60s isn't too long for the delayed-packet problem when the sources and the targets are one gigabit switch away...
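If the TIME_WAITs are piling up on the connecting (client) side like that, one thing to try -- a sketch, assuming TCP timestamps are enabled (the default), and only my suggestion, not something tested in this setup -- is letting the stack reuse TIME-WAIT sockets for new outgoing connections:

    # Affects outgoing connections only; much safer than tcp_tw_recycle,
    # which breaks clients behind NAT.
    sysctl -w net.ipv4.tcp_tw_reuse=1
    # make it stick across reboots
    echo "net.ipv4.tcp_tw_reuse = 1" >> /etc/sysctl.conf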
The app that runs on that box is very high volume, so we get a large number of TIME_WAITs; during performance testing on a dual-proc quad-core we can get up to 63,000 of them.
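In case anyone wants to reproduce that count, here's one way to do it (a sketch; ss comes with iproute2, and plain netstat works too):

    # Count sockets currently in TIME-WAIT (skip ss's header line)
    ss -tan state time-wait | tail -n +2 | wc -l
    # or the old-school way
    netstat -ant | grep -c TIME_WAIT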
Hmm... I think I just understood why I cap at around 14,000 in my tests:

cat /proc/sys/net/ipv4/ip_local_port_range
32768   61000

(61000 - 32768) / 2 = 14116

Could that be it?
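If the ephemeral port range really is the ceiling here, widening it is a one-liner (a sketch; the exact bounds are up to you, just keep them clear of any ports your services actually listen on):

    # Widen the local (ephemeral) port range for the running kernel
    sysctl -w net.ipv4.ip_local_port_range="15000 65000"
    # persist across reboots
    echo "net.ipv4.ip_local_port_range = 15000 65000" >> /etc/sysctl.conf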
So IMO don't worry about TIME_WAITs unless you're seriously into the tens of thousands, at which point you may want to think about optimizing the traffic flow to your systems, like we did with our load balancers.
We already use LVS+keepalived and it seems to work fine so far (except when I tested 1.1.16 ^_^).
Thx, JD