Hi all.
I have a strange problem at hand regarding UDP fragmentation on Centos7: Applications are unable to receive UDP packets which have undergone fragmentation UNLESS the netfilter modules are loaded.
The problem arose with an application which runs fine on OpenSuse but does not work on Centos7. The application processes UDP data, and on Centos only small packets are received and processed, i.e. packets below the fragmentation size limit of about 1500 bytes. UDP packets which have undergone fragmentation are not received by the application.
The application in question uses Qt, which opens the UDP socket in non-blocking mode. That appears to matter, because reading from the socket in blocking mode does not trigger the problem.
By chance I hit on the fact that once the netfilter kernel modules (nf_nat, iptable_nat ...) are loaded, the problem disappears and UDP packets of all sizes are correctly delivered and processed.
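For anyone trying to reproduce this, one possible cross-check is whether the kernel reassembles the fragments at all, with and without the modules loaded. The sketch below reads the IP reassembly counters from /proc/net/snmp so they can be compared before and after sending a large datagram. The helper names (`read_snmp`, `parse_ip_counters`, `reassembly_counters`) are made up for illustration, and this is only a diagnostic aid under the assumption that the counters are exposed in the usual two-line "Ip:" format; it does not explain the behaviour described above.

```python
def read_snmp(path="/proc/net/snmp"):
    """Return the raw text of the kernel's SNMP counter file (Linux only)."""
    with open(path) as f:
        return f.read()


def parse_ip_counters(snmp_text):
    """Parse the 'Ip:' header line and the 'Ip:' value line into a dict."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Ip:")]
    if len(rows) < 2:
        raise ValueError("no 'Ip:' header/value pair found")
    header, values = rows[0], rows[1]
    # Skip the leading 'Ip:' token in both rows and pair names with values.
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def reassembly_counters(snmp_text):
    """Pick out the counters that describe IP fragment reassembly."""
    ip = parse_ip_counters(snmp_text)
    return {key: ip[key] for key in ("ReasmReqds", "ReasmOKs", "ReasmFails")}
```

Calling `reassembly_counters(read_snmp())` once before and once after sending the 10000-byte packet shows whether ReasmOKs or ReasmFails moved in each configuration.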
NOTES:
- I'm not using netfilter. My iptables are empty, firewalld is not running.
- Other networking applications (at least the TCP ones) are working fine: web browsing, ssh, nfs etc., even DNS.
- Does not happen on Opensuse, regardless of whether the netfilter modules are loaded or not.
- Does not happen on Opensuse on the same machine. Does happen on different machines on Centos7. So it's not hardware dependent.
- There is AFAIK nothing special about my Centos7 installation. Out-of-the-box install, simple network config, latest updates applied.
- Rebuilding the application on Centos7 with the Centos-supplied gcc, libs etc. does not make the problem go away.
- I have broken the application down to a small Qt test program which opens a UDP socket, binds and waits on it.
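For readers without Qt at hand, the same sequence of steps (non-blocking UDP socket, SO_BROADCAST, bind, select, a 1-byte MSG_PEEK, then the full read) can be approximated in a few lines of Python. This is a sketch, not the original Qt test program: the function names `open_nonblocking_udp` and `wait_and_read` are made up for illustration, and whether it triggers the same behaviour as the Qt program is untested.

```python
import select
import socket


def open_nonblocking_udp(host="0.0.0.0", port=10001):
    """Open a non-blocking UDP socket with SO_BROADCAST set and bind it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind((host, port))
    return sock


def wait_and_read(sock, timeout=None):
    """Wait in select(), peek one byte, then read the whole datagram.

    Returns ("ok", data, sender), ("timeout", None, None) if select()
    timed out, or ("eagain", None, None) if select() reported the socket
    readable but the read then failed with EAGAIN/EWOULDBLOCK.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return ("timeout", None, None)
    try:
        # 1-byte peek first, then the real read with a 64 KiB buffer.
        sock.recvfrom(1, socket.MSG_PEEK)
        data, sender = sock.recvfrom(65536)
    except BlockingIOError:
        return ("eagain", None, None)
    return ("ok", data, sender)
```

To use it, call `wait_and_read(open_nonblocking_udp())` in a loop on the receiving machine and send a 10000-byte datagram to port 10001 from another host. An "eagain" result right after select() returns would correspond to the failing case.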
This is the strace output of the problem, where a 10000-byte UDP packet is sent to the application and triggers the select(); the recvfrom(7...) then fails with EAGAIN:
[...]
socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_IP) = 7
fcntl(7, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(7, F_SETFL, O_RDWR|O_NONBLOCK) = 0
setsockopt(7, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
bind(7, {sa_family=AF_INET, sin_port=htons(10001), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
getsockname(7, {sa_family=AF_INET, sin_port=htons(10001), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
getpeername(7, 0x7ffdf3073470, [16]) = -1 ENOTCONN (Transport endpoint is not connected)
getsockopt(7, SOL_SOCKET, SO_TYPE, [2], [4]) = 0
select(8, [3 7], [], [], NULL) = 1 (in [7])
recvfrom(7, 0x7ffdf3072e1b, 1, 2, 0x7ffdf3072e20, 0x7ffdf3072e1c) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3 7], [], [], NULL
[...]
And after the netfilter modules are loaded:
[...]
socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_IP) = 7
fcntl(7, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(7, F_SETFL, O_RDWR|O_NONBLOCK) = 0
setsockopt(7, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
bind(7, {sa_family=AF_INET, sin_port=htons(10001), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
getsockname(7, {sa_family=AF_INET, sin_port=htons(10001), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
getpeername(7, 0x7ffc5939e8c0, [16]) = -1 ENOTCONN (Transport endpoint is not connected)
getsockopt(7, SOL_SOCKET, SO_TYPE, [2], [4]) = 0
select(8, [3 7], [], [], NULL) = 1 (in [7])
recvfrom(7, "x", 1, MSG_PEEK, {sa_family=AF_INET, sin_port=htons(60921), sin_addr=inet_addr("10.77.32.30")}, [16]) = 1
recvfrom(7, "x", 1, MSG_PEEK, {sa_family=AF_INET, sin_port=htons(60921), sin_addr=inet_addr("10.77.32.30")}, [16]) = 1
recvfrom(7, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 65536, 0, {sa_family=AF_INET, sin_port=htons(60921), sin_addr=inet_addr("10.77.32.30")}, [16]) = 10000
recvfrom(7, 0x7ffc5939e0bb, 1, 2, 0x7ffc5939e0c0, 0x7ffc5939e0bc) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3 7], [], [], NULL
[...]
Any help? Is this a bug?
Regards .....Volker