I have never played with InfiniBand, but I think those cards most probably offer some checksum-offloading capability. Have you explored in that direction and tested with checksumming in offloaded mode?

Best Regards,
Strahil Nikolov

At 15:49 +0000 on Mon, 25.01.2021, lejeczek via CentOS wrote:
>
> On 22/01/2021 00:33, Steven Tardy wrote:
> > On Thu, Jan 21, 2021 at 6:34 PM lejeczek via CentOS
> > <centos at centos.org <mailto:centos at centos.org>> wrote:
> >
> >     Hi guys.
> >
> >     Hoping some net experts may stumble upon this message. I have
> >     an IPoIB direct host-to-host connection and:
> >
> >     -> $ ethtool ib1
> >     Settings for ib1:
> >             Supported ports: [ ]
> >             Supported link modes:   Not reported
> >             Supported pause frame use: No
> >             Supports auto-negotiation: No
> >             Supported FEC modes: Not reported
> >             Advertised link modes:  Not reported
> >             Advertised pause frame use: No
> >             Advertised auto-negotiation: No
> >             Advertised FEC modes: Not reported
> >             Speed: 40000Mb/s
> >             Duplex: Full
> >             Auto-negotiation: on
> >             Port: Other
> >             PHYAD: 255
> >             Transceiver: internal
> >             Link detected: yes
> >
> >     and that's both ends, both hosts, yet:
> >
> >     -> $ iperf3 -c 10.5.5.97
> >     Connecting to host 10.5.5.97, port 5201
> >     [  5] local 10.5.5.49 port 56874 connected to 10.5.5.97 port 5201
> >     [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> >     [  5]   0.00-1.00   sec  1.36 GBytes  11.6 Gbits/sec    0   2.50 MBytes
> >     [  5]   1.00-2.00   sec  1.87 GBytes  16.0 Gbits/sec    0   2.50 MBytes
> >     [  5]   2.00-3.00   sec  1.84 GBytes  15.8 Gbits/sec    0   2.50 MBytes
> >     [  5]   3.00-4.00   sec  1.83 GBytes  15.7 Gbits/sec    0   2.50 MBytes
> >     [  5]   4.00-5.00   sec  1.61 GBytes  13.9 Gbits/sec    0   2.50 MBytes
> >     [  5]   5.00-6.00   sec  1.60 GBytes  13.8 Gbits/sec    0   2.50 MBytes
> >     [  5]   6.00-7.00   sec  1.56 GBytes  13.4 Gbits/sec    0   2.50 MBytes
> >     [  5]   7.00-8.00   sec  1.52 GBytes  13.1 Gbits/sec    0   2.50 MBytes
> >     [  5]   8.00-9.00   sec  1.52 GBytes  13.1 Gbits/sec    0   2.50 MBytes
> >     [  5]   9.00-10.00  sec  1.52 GBytes  13.1 Gbits/sec    0   2.50 MBytes
> >     - - - - - - - - - - - - - - - - - - - - - - - - -
> >     [ ID] Interval           Transfer     Bitrate         Retr
> >     [  5]   0.00-10.00  sec  16.2 GBytes  13.9 Gbits/sec    0   sender
> >     [  5]   0.00-10.00  sec  16.2 GBytes  13.9 Gbits/sec        receiver
> >
> >     It's rather an oldish platform which hosts the link; PCIe is
> >     only 2.0, but with a link of x8 it should be able to carry
> >     more than ~13 Gbits/sec.
> >     The InfiniBand card is a Mellanox ConnectX-3.
> >
> >     Any thoughts on how to track down the bottleneck, or any
> >     thoughts at all?
> >
> > Care to capture (a few seconds) of the *sender*-side .pcap?
> > Often a too-small TCP receive window, packet loss, or the
> > round-trip time is to blame.
> > All of these would be evident in the packet capture.
> >
> > If you do multiple streams with the `-P 8` flag, does that
> > increase the throughput?
> >
> > Google says these endpoints are 1.5 ms apart:
> >
> >     (2.5 megabytes) / (13 Gbps) = 1.53846154 milliseconds
> >
>
> It seems that the platform overall might not be enough. The
> bitrate goes down even further when the CPUs are fully loaded and
> occupied.
> (I'll keep on investigating.)
>
> What I'm trying next is to have both ports (it's a dual-port card)
> "teamed" by NetworkManager, with the runner set to broadcast. I'm
> leaving out "p-key", which NM sets to "default" (and which works with
> a "regular" IPoIB connection).
> RHEL's "networking guide" docs say "...create a team from
> two or more Wired or InfiniBand connections..."
> When I try to stand up such a team, the master starts, but both
> slaves fail with:
> "...
> <info>  [1611588576.8887] device (ib1): Activation: starting
> connection 'team1055-slave-ib1'
> (900d5073-366c-4a40-8c32-ac42c76f9c2e)
> <info>  [1611588576.8889] device (ib1): state change:
> disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
> <info>  [1611588576.8973] device (ib1): state change:
> prepare -> config (reason 'none', sys-iface-state: 'managed')
> <info>  [1611588576.9199] device (ib1): state change: config
> -> ip-config (reason 'none', sys-iface-state: 'managed')
> <warn>  [1611588576.9262] device (ib1): Activation:
> connection 'team1055-slave-ib1' could not be enslaved
> <info>  [1611588576.9272] device (ib1): state change:
> ip-config -> failed (reason 'unknown', sys-iface-state: 'managed')
> <info>  [1611588576.9280] device (ib1): released from master
> device nm-team
> <info>  [1611589045.6268] device (ib1): carrier: link connected
> ..."
>
> Any suggestions also appreciated.
> thanks, L
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> https://lists.centos.org/mailman/listinfo/centos
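On the receive-window point raised in the thread: the "1.5 ms" figure is just the bandwidth-delay product worked backwards. A quick back-of-the-envelope sketch (using the 2.50 MByte Cwnd and ~13 Gbit/s taken from the iperf3 output above; not a diagnosis of this particular link):

```shell
# Bandwidth-delay-product sanity check: with a fixed window,
# throughput is capped at window / RTT, so the observed rate implies
# the window becomes the bottleneck once RTT reaches window / rate.
window_bytes=2500000        # iperf3 reported Cwnd: 2.50 MBytes
rate_bps=13000000000        # observed ~13 Gbit/s
awk -v w="$window_bytes" -v r="$rate_bps" \
    'BEGIN { printf "window-limited at RTT >= %.2f ms\n", w * 8 / r * 1000 }'
# prints: window-limited at RTT >= 1.54 ms
```

If the window really is the limit on a back-to-back link, testing with a larger window (`iperf3 -w`) or raising the receiver's TCP buffer sysctls may move the number; the offload state Strahil asks about can be listed with `ethtool -k ib1`.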