Hi,
The kmods SIG looks like a great place to introduce the out-of-tree versions of the ENA and EFA drivers for supporting Amazon EC2 instances. If this is the right place for that work, I would like to add my support by contributing effort to maintain them here.
On 02/06/2021 19.02, David Duncan via CentOS-devel wrote:
Hi,
The kmods SIG looks like a great place to introduce the out-of-tree versions of the ENA and EFA drivers for supporting Amazon EC2 instances. If this is the right place for that work, I would like to add my support by contributing effort to maintain them here.
I don't know the ENA and EFA drivers in detail, so my conclusion may not be accurate. But after a short look I cannot see any immediate reasons that speak against it.
Only the optional GPUDirect RDMA for EFA may not be enabled, as it depends on the proprietary NVIDIA driver, which we cannot provide for legal reasons.
And I have not had a close look at the rdma-core and/or libfabric packages in CentOS Stream 8 for some time, hence I do not know whether these have enabled all bits required by the EFA driver. However, I vaguely remember seeing updates to add EFA support for both. But even without that, the kernel modules themselves might be useful for some users.
Tl;dr: IMO the kmods SIG is a good place for such work.
Probably the easiest way to maintain these is to join the (proposed) kmods SIG. Other SIG members can help with packaging-related issues, but making sure that both drivers still compile and work after a new kernel update would then probably be mainly up to you.
Please let me know if you want to join this SIG.
On 02/06/2021 21.07, Peter Georg wrote:
On 02/06/2021 19.02, David Duncan via CentOS-devel wrote:
Hi,
The kmods SIG looks like a great place to introduce the out-of-tree versions of the ENA and EFA drivers for supporting Amazon EC2 instances. If this is the right place for that work, I would like to add my support by contributing effort to maintain them here.
I don't know the ENA and EFA drivers in detail, so my conclusion may not be accurate. But after a short look I cannot see any immediate reasons that speak against it.
Only the optional GPUDirect RDMA for EFA may not be enabled, as it depends on the proprietary NVIDIA driver, which we cannot provide for legal reasons.
And I have not had a close look at the rdma-core and/or libfabric packages in CentOS Stream 8 for some time, hence I do not know whether these have enabled all bits required by the EFA driver. However, I vaguely remember seeing updates to add EFA support for both. But even without that, the kernel modules themselves might be useful for some users.
Tl;dr: IMO the kmods SIG is a good place for such work.
Probably the easiest way to maintain these is to join the (proposed) kmods SIG. Other SIG members can help with packaging-related issues, but making sure that both drivers still compile and work after a new kernel update would then probably be mainly up to you.
Please let me know if you want to join this SIG.
As Josh noted in his answer, these two drivers are available in EL8. Somehow I thought they were available only in EL9, not in EL8. Sorry, my fault. Hence my answer above is incorrect and should be ignored (at least for now). It first needs to be shown that these out-of-tree versions are indeed sufficiently different from the versions provided by the EL kernel.
On Wed, Jun 2, 2021 at 1:02 PM David Duncan via CentOS-devel centos-devel@centos.org wrote:
Hi,
The kmods SIG looks like a great place to introduce the out-of-tree versions of the ENA and EFA drivers for supporting Amazon EC2 instances. If this is the right place for that work, I would like to add my support by contributing effort to maintain them here.
RHEL ships ENA and EFA drivers. Are these the same drivers you're referring to? If not, can you describe how they're different?
I ask because if RHEL ships a driver then Stream will ship the same driver directly. It wouldn't make much sense to have them in a SIG unless they were somehow different, and even then there would be interesting issues to work out around conflicting drivers.
josh