[CentOS-devel] Balancing the needs around the RHEL platform

Thu Dec 31 17:24:02 UTC 2020
Lamar Owen <lowen at pari.edu>

On 12/31/20 7:49 AM, Nico Kadel-Garcia wrote:
> If they cost you that much work, consider buying modest Adaptec RAID 
> controllers. MegaRAID had a bad habit of doing a lot of the RAID work 
> in the kernel, not on the attached hardware, which was why the kernel 
> modules were so critical. ...I'm very surprised if you have hardware 
> that actually needs that third-party driver, though it may take work 
> to get the latest kernel into a bootable ISO or PXE image. 
Well:
[root@grymonia ~]# lspci|grep RAID
03:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS 2108 [Liberator] (rev 05)
[root@grymonia ~]# lspci -n|grep 03.00.0
03:00.0 0104: 1000:0079 (rev 05)
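
As a small sketch of how to pull the vendor and device IDs out of that numeric output, assuming the usual three-field `lspci -n` layout (slot, class, vendor:device); the function name is hypothetical and not part of any package:

```shell
# Hypothetical helper: read `lspci -n` lines on stdin and print
# "vendor device" for each, taken from the third field (vendor:device).
pci_ids_from_lspci_n() {
    awk '{ split($3, id, ":"); print id[1], id[2] }'
}

# Possible use on a live system (slot taken from the listing above):
#   lspci -n -s 03:00.0 | pci_ids_from_lspci_n
```

The two values it prints are the same ones that appear as 0x1000:0x0079 in vendor documentation.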

See http://elrepo.org/tiki/DeviceIDs and 
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/considerations_in_adopting_rhel_8/hardware-enablement_considerations-in-adopting-rhel-8#removed-adapters_hardware-enablement

Note especially the wording in the latter link: "The following adapters 
from the megaraid_sas driver have been removed: ... SAS0079GEN2, PCI ID 
0x1000:0x0079 ..."  The problem is NOT that megaraid_sas has been 
removed (it hasn't, see below); the problem is that PCI ID 0x1000:0x0079 
has been removed from the in-kernel megaraid_sas driver.  Yes, the 
released kernel HAS a megaraid_sas.ko.xz module, but Red Hat's patches 
remove the PCI ID of the controller in the R710 from that released 
module. (This is not the only hardware in this situation; 
it honestly looks like Red Hat asked Dell what Dell wanted supported and 
Dell cut off anything that Dell considers to be "too old.") Here's what 
the modules tree looks like, along with a listing of the contents of the 
ELRepo kmod-megaraid_sas package:

[root@grymonia modules]# find /usr/lib/modules -name megaraid_sas* -print
/usr/lib/modules/4.18.0-193.el8.x86_64/extra/megaraid_sas
/usr/lib/modules/4.18.0-193.el8.x86_64/extra/megaraid_sas/megaraid_sas.ko
/usr/lib/modules/4.18.0-193.14.2.el8_2.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz
/usr/lib/modules/4.18.0-193.14.2.el8_2.x86_64/weak-updates/megaraid_sas
/usr/lib/modules/4.18.0-193.14.2.el8_2.x86_64/weak-updates/megaraid_sas/megaraid_sas.ko
/usr/lib/modules/4.18.0-193.19.1.el8_2.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz
/usr/lib/modules/4.18.0-193.19.1.el8_2.x86_64/weak-updates/megaraid_sas
/usr/lib/modules/4.18.0-193.19.1.el8_2.x86_64/weak-updates/megaraid_sas/megaraid_sas.ko
/usr/lib/modules/4.18.0-193.28.1.el8_2.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz
/usr/lib/modules/4.18.0-193.28.1.el8_2.x86_64/weak-updates/megaraid_sas
/usr/lib/modules/4.18.0-193.28.1.el8_2.x86_64/weak-updates/megaraid_sas/megaraid_sas.ko
[root@grymonia modules]# rpm -ql kmod-megaraid_sas
/etc/depmod.d/kmod-megaraid_sas.conf
/lib/modules/4.18.0-193.el8.x86_64
/lib/modules/4.18.0-193.el8.x86_64/extra
/lib/modules/4.18.0-193.el8.x86_64/extra/megaraid_sas
/lib/modules/4.18.0-193.el8.x86_64/extra/megaraid_sas/megaraid_sas.ko
/usr/share/doc/kmod-megaraid_sas-07.710.50.00
/usr/share/doc/kmod-megaraid_sas-07.710.50.00/GPL-v2.0.txt
/usr/share/doc/kmod-megaraid_sas-07.710.50.00/greylist.txt
[root@grymonia modules]#

It's _not_ a third-party driver if the driver is in the released 
kernel.  This is the point that keeps getting glossed over: some of 
these point-release-dependent drivers restore functionality to modules 
that are already in the kernel, where that functionality has been 
intentionally removed by Red Hat because Red Hat won't support that 
hardware.  If Red Hat is patching out those PCI IDs for RHEL, it's 
highly likely, in my opinion, that Red Hat will patch them out of the 
Stream kernels as well (all Red Hat has to do to restore support is 
stop patching out that PCI ID).
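
One way to check whether a given module build still claims a PCI ID is to look at its alias table. The sketch below assumes the standard `pci:v<vendor>d<device>sv...` modalias format with uppercase hex digits, as printed by `modinfo -F alias`; the function name is hypothetical:

```shell
# Hypothetical helper: read module aliases on stdin and succeed only if
# one of them claims the given PCI vendor and device (hex, without 0x).
pci_id_in_aliases() {
    # Modalias strings use uppercase hex, so normalise the arguments.
    vendor=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    device=$(printf '%s' "$2" | tr 'a-f' 'A-F')
    # Aliases look like: pci:v00001000d00000079sv*sd*bc*sc*i*
    grep -q "^pci:v0000${vendor}d0000${device}sv"
}

# Possible use on a live system, per installed kernel:
#   modinfo -k "$(uname -r)" -F alias megaraid_sas \
#       | pci_id_in_aliases 1000 0079 && echo "ID is claimed"
```

Note that `modinfo` reports on whichever megaraid_sas module depmod resolves for that kernel, so with a weak-updates override installed it may describe the kmod rather than the stock in-kernel module; passing the full path to a specific .ko or .ko.xz file avoids that ambiguity.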

> Makes you wonder why someone gave them away?

I actually don't wonder, as it's really simple and I know exactly why 
they donated them to us: they upgraded (the R710 is a few years old 
now), we needed them (they're newer by five years and much better than 
what we were already using), they got a tax credit, and we got effective 
public-support revenue for our 990 (and it delays these R710s from 
becoming e-waste).  It's a win-win; they're still good servers with 
excellent performance for our workloads here, and I tend to ask for 
spares of everything that's donated so I can keep it running.  It is far 
less expensive to deal with these issues than to buy new servers.
--
They say hindsight is 20/20;
Now, can 2020 just be hindsight already?