Hi All,
We are looking for suggestions on dealing with Mellanox drivers in CentOS 6.7.
We tried installing the Mellanox drivers (MLNX_OFED_LINUX-3.2-2.0.0.0-rhel6.7-x86_64) on a Quanta Cirrascale server running CentOS 6.7 (kernel 2.6.32-573.22.1.el6.x86_64). When we rebooted the machine after installing the drivers, it went into a kernel panic for every installed kernel except 2.6.32-573.22.1.el6.x86_64.debug. After we uninstalled the drivers, the machine failed to boot with any installed kernel.
Any suggestions on how to proceed would be greatly appreciated.
Thanks
On Tue, 24 May 2016 21:08:27 -0400 Pat Haley phaley@mit.edu wrote:
Hi All,
We are looking for suggestions on dealing with Mellanox drivers in CentOS 6.7.
Unless you really need a specific feature in MOFED, I'd recommend you stay with the so-called in-box drivers already in CentOS 6.7.
We run >2000 HPC nodes on CentOS 6.7 with the stock IB drivers.
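On a plain CentOS 6.7 install, pulling in the stock stack is just a group install (a minimal sketch, assuming the EL6 package group name; untested on your exact box):

# Install the in-box InfiniBand stack and start the RDMA stack at boot
yum -y groupinstall "Infiniband Support"
chkconfig rdma on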
We tried installing the Mellanox drivers (MLNX_OFED_LINUX-3.2-2.0.0.0-rhel6.7-x86_64) on a Quanta Cirrascale server running CentOS 6.7 (kernel 2.6.32-573.22.1.el6.x86_64). When we rebooted the machine after installing the drivers, it went into a kernel panic for every installed kernel except 2.6.32-573.22.1.el6.x86_64.debug.
Sounds as if your initramfs got assembled incorrectly. Kernel panic is usually due to the initramfs not having what it takes to find and mount the root-fs. You can try to manually update the initrd for a specific installed kernel.
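For example (a sketch, assuming the broken kernel is the 2.6.32-573.22.1.el6.x86_64 one from your mail; boot the still-working .debug kernel first):

# Rebuild the initramfs for the broken kernel; -f overwrites the old image
dracut -f /boot/initramfs-2.6.32-573.22.1.el6.x86_64.img 2.6.32-573.22.1.el6.x86_64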
/Peter K
On 25/05/16 03:08, Pat Haley wrote:
Hi All,
We are looking for suggestions on dealing with Mellanox drivers in CentOS 6.7.
We tried installing the Mellanox drivers (MLNX_OFED_LINUX-3.2-2.0.0.0-rhel6.7-x86_64) on a Quanta Cirrascale server running CentOS 6.7 (kernel 2.6.32-573.22.1.el6.x86_64). When we rebooted the machine after installing the drivers, it went into a kernel panic for every installed kernel except 2.6.32-573.22.1.el6.x86_64.debug. After we uninstalled the drivers, the machine failed to boot with any installed kernel.
Any suggestions on how to proceed would be greatly appreciated.
Thanks
Well, we (CentOS) are using a Gluster setup on top of InfiniBand, but we're just using the default mlx4_ib kernel module included with the kernel shipped in 6.7 (/lib/modules/2.6.32-573.22.1.el6.x86_64/kernel/drivers/infiniband/hw/mlx4/mlx4_ib.ko), so there is nothing to be done at the kernel/initrd level.
Is there a reason why you needed a different version?
PS: the IB HBA model we have in those servers is the following one:
81:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)
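If you want to double-check you're running the in-box driver rather than a MOFED leftover, something like this should do (output paths from memory, yours will differ):

# The in-box module lives under the stock kernel tree
lsmod | grep mlx4
modinfo mlx4_ib | grep filename
# filename should point at /lib/modules/<kernel>/kernel/drivers/infiniband/hw/mlx4/mlx4_ib.ko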
We have a new install of CentOS 6.7 with InfiniBand support installed. We can see the card in hardware, and we can see the mlx4 drivers loaded in the kernel, but we cannot see the card as an Ethernet interface using ifconfig -a. Can you recommend an install procedure to see this as an Ethernet interface?
Thanks
On 05/25/2016 07:32 AM, Fabian Arrotin wrote:
On 25/05/16 03:08, Pat Haley wrote:
Hi All,
We are looking for suggestions on dealing with Mellanox drivers in CentOS 6.7.
We tried installing the Mellanox drivers (MLNX_OFED_LINUX-3.2-2.0.0.0-rhel6.7-x86_64) on a Quanta Cirrascale server running CentOS 6.7 (kernel 2.6.32-573.22.1.el6.x86_64). When we rebooted the machine after installing the drivers, it went into a kernel panic for every installed kernel except 2.6.32-573.22.1.el6.x86_64.debug. After we uninstalled the drivers, the machine failed to boot with any installed kernel.
Any suggestions on how to proceed would be greatly appreciated.
Thanks
Well, we (CentOS) are using a Gluster setup on top of InfiniBand, but we're just using the default mlx4_ib kernel module included with the kernel shipped in 6.7 (/lib/modules/2.6.32-573.22.1.el6.x86_64/kernel/drivers/infiniband/hw/mlx4/mlx4_ib.ko), so there is nothing to be done at the kernel/initrd level.
Is there a reason why you needed a different version?
PS: the IB HBA model we have in those servers is the following one:
81:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)
On Wed, 25 May 2016 11:48:55 -0400 Pat Haley phaley@mit.edu wrote:
We have a new install of CentOS 6.7 with InfiniBand support installed. We can see the card in hardware, and we can see the mlx4 drivers loaded in the kernel, but we cannot see the card as an Ethernet interface using ifconfig -a. Can you recommend an install procedure to see this as an Ethernet interface?
It's not an Ethernet interface. I guess you're referring to a TCP/IP interface.
TCP/IP is provided on InfiniBand by the IPoIB ULP (upper-level protocol) and the ib_ipoib kernel module.
ibstat and ibv_devinfo are commands for showing InfiniBand interfaces.
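So, assuming the stock stack, something like this should get you an ib0 to put an IP on (interface name is the usual default, yours may differ):

# Load the IPoIB module, then check port state and the new interface
modprobe ib_ipoib
ibstat
ip addr show ib0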
/Peter K
Did you take a look under /sys/class/net? On our cluster I create the ifcfg by hand.
Sent from my iPhone
On 26.05.2016 at 13:38, Peter Kjellström cap@nsc.liu.se wrote:
On Wed, 25 May 2016 11:48:55 -0400 Pat Haley phaley@mit.edu wrote:
We have a new install of CentOS 6.7 with InfiniBand support installed. We can see the card in hardware, and we can see the mlx4 drivers loaded in the kernel, but we cannot see the card as an Ethernet interface using ifconfig -a. Can you recommend an install procedure to see this as an Ethernet interface?
It's not an Ethernet interface. I guess you're referring to a TCP/IP interface.
TCP/IP is provided on InfiniBand by the IPoIB ULP (upper-level protocol) and the ib_ipoib kernel module.
ibstat and ibv_devinfo are commands for showing InfiniBand interfaces.
/Peter K
for example:
[andy@node01 /]$ cat /sys/class/net/ib0/address
80:00:00:48:fe:80:00:00:00:00:00:00:f4:52:14:03:00:78:3d:41
[andy@node01 /]$ cat /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE="ib0"
TYPE=Infiniband
ONBOOT="yes"
BOOTPROTO="static"
NM_CONTROLLED="yes"
HWADDR=80:00:00:48:fe:80:00:00:00:00:00:00:f4:52:14:03:00:78:3d:41
NETWORK=192.168.200.0
BROADCAST=192.168.200.255
IPADDR=192.168.200.101
NETMASK=255.255.255.0
The ifcfg-ib0 file was made by hand, because we integrate the cards later.
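Once the file is in place, the standard EL6 network scripts bring it up (nothing IB-specific here):

# Activate ib0 and verify the address stuck
ifup ib0
ip addr show ib0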
Sincerely
Andy