[CentOS-devel] Balancing the needs around the RHEL platform

Thu Dec 24 01:26:52 UTC 2020
Neal Gompa <ngompa13 at gmail.com>

On Wed, Dec 23, 2020 at 7:43 PM Mark Mielke <mark.mielke at gmail.com> wrote:
> On Wed, Dec 23, 2020 at 7:15 PM Neal Gompa <ngompa13 at gmail.com> wrote:
> > On Wed, Dec 23, 2020 at 4:38 PM Phil Perry <pperry at elrepo.org> wrote:
> > > If Red Hat really wanted to fix this in (a) kernel, the solution would
> > > have been to accept the repeated upstream requests to backport the
> > > driver into the RHEL kernel, but that idea/request has been rejected.
> > No. The correct fix here is to start blocking RHEL kernel updates
> > against third-party Free Software kernel module packages to ensure
> > compatibility isn't broken and the kernel ABI stops breaking on every
> > kernel version series. The reason it keeps breaking is because there's
> > no current mechanism in which these are tested together to validate
> > them for release.
> I think you are correct. I also think there is a long-ish road to get
> here. :-) Overall, it would have the best long-term results. It
> requires everyone who has requirements to document them as
> automated tests.

I'm more optimistic. The RHEL kernel for RHEL 9 is already being
developed in Fedora ELN[0] through the Always Ready Kernel project[1].
As for RHEL 8 and CentOS Stream 8, we can wire up validation testing
using the Zuul instance that the project stood up as part of the
Fedora CI work. That infrastructure integrates with the CentOS Pagure
server and we can do all kinds of interesting things with it.

[0]: https://docs.fedoraproject.org/en-US/eln/
[1]: https://gitlab.com/cki-project/kernel-ark
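
To give a rough idea of the kind of check such validation testing
could run, here is a minimal sketch that compares exported symbol CRCs
between two kernel builds for the symbols a third-party module needs.
It assumes Module.symvers-style input (CRC, symbol, module, export
type per line). The function names, the sample symbol lists, and the
CRC values are all made up for illustration; this is not how any
existing CentOS or Fedora CI job works.

```python
# Hypothetical sketch: detect kABI breaks that would affect a kernel module.
# Assumes Module.symvers-style lines: "<crc>\t<symbol>\t<module>\t<export type>".

def parse_symvers(text):
    """Return {symbol: crc} parsed from Module.symvers-style text."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        crc, name = fields[0], fields[1]
        symbols[name] = crc
    return symbols


def kabi_breaks(old_text, new_text, required):
    """Report required symbols that are missing or whose CRC changed.

    `required` is the set of symbols the third-party module imports.
    Returns {symbol: "missing" | "changed"}; empty means no break found.
    """
    old = parse_symvers(old_text)
    new = parse_symvers(new_text)
    problems = {}
    for sym in sorted(required):
        if sym not in new:
            problems[sym] = "missing"
        elif sym in old and old[sym] != new[sym]:
            problems[sym] = "changed"
    return problems


# Illustrative data only: the CRC values below are invented.
OLD_SAMPLE = "\n".join([
    "0x11111111\tprintk\tvmlinux\tEXPORT_SYMBOL",
    "0x22222222\t__kmalloc\tvmlinux\tEXPORT_SYMBOL",
    "0x33333333\tpci_enable_device\tvmlinux\tEXPORT_SYMBOL",
])
NEW_SAMPLE = "\n".join([
    "0x11111111\tprintk\tvmlinux\tEXPORT_SYMBOL",
    "0x2222aaaa\t__kmalloc\tvmlinux\tEXPORT_SYMBOL",
])
```

A CI job could call kabi_breaks() with the symbol tables of the
current and candidate kernels plus the import list of each tracked
kmod package, and block the kernel update whenever the result is
non-empty.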

> But, it would put a damper on "new feature that needs large kernel
> ABI changes to cost-effectively backport", such as the OverlayFS
> changes done in RHEL 7 as one of many such examples. The choice to use
> Linux 4.18 is particularly problematic, since it wasn't an LTS kernel.
> :-(
> 5 years is a long time to wait for new breaking features in the kernel.

I think we'd be in a better place aiming for it, even if all we manage
is to reduce the number of times it breaks. Right now, kABI breaks pretty
significantly on every single RHEL point release. And there have been
*several* botched backports that have screwed up both the RHEL kernel
API and kernel module builds. Even just cutting the number of times
that these kinds of breakages happen in half would be a major win.

To your comment about Red Hat not using LTS kernels: LTS kernels do
not maintain kABI upstream either, so basing RHEL on one wouldn't save
Red Hat any effort. If anything, being based on an LTS kernel would do
Red Hat fewer favors, because they'd be under pressure to conform to
something similar to the upstream LTS kernel. Since the upstream LTS
kernel lifecycle already doesn't match the RHEL kernel lifecycle, and
Red Hat engineers would wind up doing a bunch of work anyway for the
kABI stabilization and live kernel patching features, the non-LTS
kernels are strategically better: there's less churn in them and more
long-term flexibility.

> > More than most, I get why you're upset about the kABI always breaking
> > as kernel updates push out, but instead of just saying "it's not
> > suitable", we should be building solutions to *make* it suitable for
> > the Enterprise. It's *bad* that the RHEL kernel breaks its own
> > promises so often (which is a relatively new thing, in my experience),
> > and we should be implementing safeguards to stop it from happening
> > going forward.
> Yes. Although, in the meantime...

Well, since nothing can really change now, I'm looking forward a bit
and trying to see how we can take advantage of the situation to better
the wider community. After all, the biggest virtue of a true open
source community is the ability to adapt.

真実はいつも一つ!/ Always, there's only one truth!