[CentOS] Infiniband special ops?

Mon Jan 25 15:49:26 UTC 2021
lejeczek <peljasz at yahoo.co.uk>


On 22/01/2021 00:33, Steven Tardy wrote:
> On Thu, Jan 21, 2021 at 6:34 PM lejeczek via CentOS 
> <centos at centos.org> wrote:
>
>     Hi guys.
>
>     Hoping some net experts may stumble upon this
>     message. I have an IPoIB direct host-to-host
>     connection and:
>
>     -> $ ethtool ib1
>     Settings for ib1:
>          Supported ports: [  ]
>          Supported link modes:   Not reported
>          Supported pause frame use: No
>          Supports auto-negotiation: No
>          Supported FEC modes: Not reported
>          Advertised link modes:  Not reported
>          Advertised pause frame use: No
>          Advertised auto-negotiation: No
>          Advertised FEC modes: Not reported
>          Speed: 40000Mb/s
>          Duplex: Full
>          Auto-negotiation: on
>          Port: Other
>          PHYAD: 255
>          Transceiver: internal
>          Link detected: yes
>
>     and that's both ends, both hosts, yet:
>
>      > $ iperf3 -c 10.5.5.97
>     Connecting to host 10.5.5.97, port 5201
>     [  5] local 10.5.5.49 port 56874 connected to 10.5.5.97 port 5201
>     [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
>     [  5]   0.00-1.00   sec  1.36 GBytes  11.6 Gbits/sec    0   2.50 MBytes
>     [  5]   1.00-2.00   sec  1.87 GBytes  16.0 Gbits/sec    0   2.50 MBytes
>     [  5]   2.00-3.00   sec  1.84 GBytes  15.8 Gbits/sec    0   2.50 MBytes
>     [  5]   3.00-4.00   sec  1.83 GBytes  15.7 Gbits/sec    0   2.50 MBytes
>     [  5]   4.00-5.00   sec  1.61 GBytes  13.9 Gbits/sec    0   2.50 MBytes
>     [  5]   5.00-6.00   sec  1.60 GBytes  13.8 Gbits/sec    0   2.50 MBytes
>     [  5]   6.00-7.00   sec  1.56 GBytes  13.4 Gbits/sec    0   2.50 MBytes
>     [  5]   7.00-8.00   sec  1.52 GBytes  13.1 Gbits/sec    0   2.50 MBytes
>     [  5]   8.00-9.00   sec  1.52 GBytes  13.1 Gbits/sec    0   2.50 MBytes
>     [  5]   9.00-10.00  sec  1.52 GBytes  13.1 Gbits/sec    0   2.50 MBytes
>     - - - - - - - - - - - - - - - - - - - - - - - - -
>     [ ID] Interval           Transfer     Bitrate         Retr
>     [  5]   0.00-10.00  sec  16.2 GBytes  13.9 Gbits/sec    0   sender
>     [  5]   0.00-10.00  sec  16.2 GBytes  13.9 Gbits/sec        receiver
>
>     It's a rather old platform hosting the link; PCIe is
>     only 2.0, but with an x8 link it should be able to
>     carry more than ~13 Gbit/s. The InfiniBand card is a
>     Mellanox ConnectX-3.
>
>     Any thoughts on how to track down the bottleneck?
>
>
>
> Care to capture a few seconds of the *sender*-side .pcap?
> Often the TCP receive window is too small, or packet loss
> or the round-trip time is to blame. All of these would be
> evident in the packet capture.
>
> If you do multiple streams with the `-P 8` flag, does
> that increase the throughput?
>
> Google says these endpoints are effectively 1.5 ms apart
> (the 2.5 MB congestion window divided by the achieved
> bitrate):
>
> (2.5 megabytes) / (13 Gbps) = 1.53846154 milliseconds
>
>
>
It seems the platform overall might simply not be up to
it: the bitrate drops even further when the CPUs are
fully loaded.
(I'll keep investigating.)
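
For the record, the checks suggested above would look
roughly like this on the sending host (interface, filter
and file names are illustrative):

-> $ tcpdump -i ib1 -s 128 -w ib1-sender.pcap host 10.5.5.97
-> $ iperf3 -c 10.5.5.97 -P 8
-> $ mpstat -P ALL 1   # per-core load while iperf3 runs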

What I'm trying next is to have both ports (it's a
dual-port card) "teamed" by NetworkManager, with the
runner set to broadcast. I'm leaving out the "p-key",
which NM sets to "default" (and which works with a
"regular" IPoIB connection).
RHEL's "Networking Guide" says you can "...create a team
from two or more Wired or InfiniBand connections...".
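
Roughly what I'm attempting, as an nmcli sketch (the
second port name "ib0" and the exact property syntax are
my approximation; p-key left untouched):

# master with the broadcast runner
-> $ nmcli connection add type team con-name team1055 ifname nm-team \
        team.config '{"runner": {"name": "broadcast"}}'
# one port connection per IB interface
-> $ nmcli connection add type infiniband con-name team1055-slave-ib0 \
        ifname ib0 master nm-team slave-type team
-> $ nmcli connection add type infiniband con-name team1055-slave-ib1 \
        ifname ib1 master nm-team slave-type team
-> $ nmcli connection up team1055
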
When I try to stand up such a team, the master starts
but both slaves fail with:
"...
<info>  [1611588576.8887] device (ib1): Activation: starting 
connection 'team1055-slave-ib1' 
(900d5073-366c-4a40-8c32-ac42c76f9c2e)
<info>  [1611588576.8889] device (ib1): state change: 
disconnected -> prepare (reason 'none', sys-iface-state: 
'managed')
<info>  [1611588576.8973] device (ib1): state change: 
prepare -> config (reason 'none', sys-iface-state: 'managed')
<info>  [1611588576.9199] device (ib1): state change: config 
-> ip-config (reason 'none', sys-iface-state: 'managed')
<warn>  [1611588576.9262] device (ib1): Activation: 
connection 'team1055-slave-ib1' could not be enslaved
<info>  [1611588576.9272] device (ib1): state change: 
ip-config -> failed (reason 'unknown', sys-iface-state: 
'managed')
<info>  [1611588576.9280] device (ib1): released from master 
device nm-team
<info>  [1611589045.6268] device (ib1): carrier: link connected
..."

Any suggestions also appreciated.
thanks, L