[CentOS] Difficulty configuring RDMA in CentOS

Wed Jun 13 22:57:13 UTC 2018
Pat Haley <phaley at mit.edu>

Hi

We are trying to configure RDMA for an InfiniBand connection between our
data server (running CentOS 6.8) and our compute nodes (running CentOS
6.6).  We have been trying to follow the instructions in

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-configuring_the_base_rdma_subsystem



However, we are getting conflicting information on whether RDMA is
correctly configured.  Some of what we have done and some of the data
are below.  Can you suggest what other tests we should run and what
other data we should collect to debug this problem?

Edit /etc/rdma/mlx4.conf to set the port types properly for RoCE/IBoE usage.

Edit /etc/modprobe.d/mlx4.conf to instruct the driver on which packet 
priority is configured for the “no-drop” service on the Ethernet 
switches the cards are plugged into.

vim /etc/rdma/mlx4.conf

# You can find the right pci device to use for any given card by loading
# the mlx4_core module, then going to /sys/bus/pci/drivers/mlx4_core and
# seeing what possible PCI devices are listed there.  The possible values
# for ports are: ib, eth, and auto.  However, not all cards support all
# types, so if you get messages from the kernel that your selected port
# type isn't supported, there's nothing this script can do about it.

[root at mseas-data2 mlx4_core]# ls /sys/bus/pci/drivers/mlx4_core
0000:81:00.0  bind  module  new_id  remove_id  uevent  unbind
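
For reference, the lookup described in the mlx4.conf comment could be scripted roughly as below. This is only a sketch: the function name list_mlx4_ports is made up for illustration, and it assumes the mlx4_core driver exposes the current port type through per-device sysfs attributes named mlx4_port1, mlx4_port2, and so on, which may not hold for every kernel or driver version.

```shell
#!/bin/sh
# Sketch: list each PCI device bound to mlx4_core and the port type it reports.
# Assumption: per-port attributes named mlx4_port<N> exist under the device
# directory. The optional first argument overrides the driver directory.
list_mlx4_ports() {
    drv_dir="${1:-/sys/bus/pci/drivers/mlx4_core}"
    for dev in "$drv_dir"/*:*:*.*; do
        # Skip the unexpanded pattern when no device is bound.
        [ -e "$dev" ] || continue
        name=$(basename "$dev")
        for port in "$dev"/mlx4_port*; do
            [ -f "$port" ] || continue
            printf '%s %s %s\n' "$name" "$(basename "$port")" "$(cat "$port")"
        done
    done
}
```

Calling list_mlx4_ports with no argument reads the real sysfs directory; entries such as bind and uevent are ignored because they do not look like PCI addresses.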

mstconfig -d 81:00.0 query

Device #1:
----------

Device type:    ConnectX3Pro
PCI device:     81:00.0

Configurations:                          Current
         SRIOV_EN                        1
         NUM_OF_VFS                      8
         LINK_TYPE_P1                    3
         LINK_TYPE_P2                    3
         LOG_BAR_SIZE                    3
         BOOT_PKEY_P1                    0
         BOOT_PKEY_P2                    0
         BOOT_OPTION_ROM_EN_P1           1
         BOOT_VLAN_EN_P1                 0
         BOOT_RETRY_CNT_P1               0
         LEGACY_BOOT_PROTOCOL_P1         1
         BOOT_VLAN_P1                    1
         BOOT_OPTION_ROM_EN_P2           1
         BOOT_VLAN_EN_P2                 0
         BOOT_RETRY_CNT_P2               0
         LEGACY_BOOT_PROTOCOL_P2         1
         BOOT_VLAN_P2                    1
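
As far as I understand the ConnectX-3 firmware settings, LINK_TYPE values map to 1 = IB, 2 = ETH and 3 = VPI (auto-sensing), so both ports above would be in VPI mode. A small sketch for decoding the query output follows; the helper names decode_link_type and show_link_types are hypothetical, and it assumes mstconfig prints each parameter name followed by a bare numeric value, as in the output above.

```shell
#!/bin/sh
# Sketch: translate numeric LINK_TYPE values from "mstconfig query" output.
# Assumption: the mapping 1=IB, 2=ETH, 3=VPI applies to this card.
decode_link_type() {
    case "$1" in
        1) echo "IB" ;;
        2) echo "ETH" ;;
        3) echo "VPI" ;;
        *) echo "unknown" ;;
    esac
}

# Reads query output on stdin and prints one line per LINK_TYPE_P<N> entry.
show_link_types() {
    awk '/LINK_TYPE_P[0-9]+/ { print $1, $2 }' | while read -r name value; do
        printf '%s=%s (%s)\n' "$name" "$value" "$(decode_link_type "$value")"
    done
}
```

It would be used as: mstconfig -d 81:00.0 query | show_link_types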

Needed packages:

Most if not all packages appear to be installed on the server.

[root at mseas-data2 ~]# service rdma status
Low level hardware support loaded:
         mlx4_ib

Upper layer protocol modules:
         ib_ipoib

User space access modules:
         rdma_ucm ib_ucm ib_uverbs ib_umad

Connection management modules:
         rdma_cm ib_cm iw_cm

Configured IPoIB interfaces: none
Currently active IPoIB interfaces: ib0
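
One further check that might help cross-check the status output above is to read the port state straight from sysfs. The sketch below assumes the kernel exposes state and link_layer files under /sys/class/infiniband/<device>/ports/<N>/ (link_layer may be absent on older kernels); the function name show_ib_port_state is made up for illustration.

```shell
#!/bin/sh
# Sketch: print state and link layer for every RDMA port found in sysfs.
# The optional first argument overrides the sysfs class directory.
show_ib_port_state() {
    ib_dir="${1:-/sys/class/infiniband}"
    for port in "$ib_dir"/*/ports/*; do
        # Skip the unexpanded pattern when no RDMA device is present.
        [ -d "$port" ] || continue
        hca=$(basename "$(dirname "$(dirname "$port")")")
        printf '%s port %s: state=%s link_layer=%s\n' \
            "$hca" "$(basename "$port")" \
            "$(cat "$port/state" 2>/dev/null)" \
            "$(cat "$port/link_layer" 2>/dev/null)"
    done
}
```

A port whose state is not ACTIVE, or whose link_layer is Ethernet when InfiniBand is expected, would point at the port type or fabric configuration rather than at the RDMA packages.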