[CentOS-devel] Balancing the needs around the RHEL platform

Thu Dec 24 13:37:29 UTC 2020
Neal Gompa <ngompa13 at gmail.com>

On Thu, Dec 24, 2020 at 8:22 AM Phil Perry <pperry at elrepo.org> wrote:
>
> On 24/12/2020 00:14, Neal Gompa wrote:
> > On Wed, Dec 23, 2020 at 4:38 PM Phil Perry <pperry at elrepo.org> wrote:
> >>
> >> On 23/12/2020 20:50, Matthew Miller wrote:
> >>> On Wed, Dec 23, 2020 at 08:23:29PM +0000, Phil Perry wrote:
> >>>> Take Wireguard VPN as an example. No sooner had upstream fixed the
> >>>> breakage caused by -257 on Monday than -259 landed and broke it
> >>>> again[2].
> >>>
> >>>
> >>> It seems like Wireguard might be a good example of something for an
> >>> alternate kernel maintained by a SIG. (Like the Xen SIG does.)
> >>>
> >>
> >> Why would you do that? The method we use in Enterprise Linux to deliver
> >> 3rd party out-of-tree drivers is the RHEL Driver Update Programme. It
> >> has been this way for over a decade. It works really well. It just
> >> doesn't work for Stream because the Stream kernel is not suitable for
> >> end user (Enterprise) consumption - it is a development kernel for
> >> developing the next RHEL point release.
> >>
> >> If Red Hat really wanted to fix this in (a) kernel, the solution would
> >> have been to accept the repeated upstream requests to backport the
> >> driver into the RHEL kernel, but that idea/request has been rejected.
> >>
> >
> > No. The correct fix here is to start blocking RHEL kernel updates
> > against third-party Free Software kernel module packages to ensure
> > compatibility isn't broken and the kernel ABI stops breaking on every
> > kernel version series. The reason it keeps breaking is that there's
> > currently no mechanism by which these are tested together to validate
> > them for release.
> >
>
> Blocking Stream kernel updates you mean?
>
> That would certainly be an option, and I have written a yum plugin (for
> el7) that does the reverse and masks kmod packages from the yum
> transaction where the required kernel is not available yet. But for such
> an approach to work, it is essential that the Stream repository contains
> all kernel releases, not just the latest as is the case at present.
>
> Further, we have an issue with the Stream installation images, which are
> constantly being updated during the latest compose and feature the
> latest Stream kernel - these are unable to use Driver Update Disk images
> (DUDs) which are generally built around the point release GA kernel and
> are likely not compatible with newer Stream kernels.
>

I'm a bit more ambitious here: I'd like kernel updates to not be
released *at all* to users unless they're validated alongside kernel
module packages.
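
As a rough sketch of what that kind of gate could look like: the
snippet below assumes a made-up data model in which each kernel build
exports a map of symbol name to checksum, and each kernel module
package records the symbols and checksums it was built against. The
function names, package names, and data are all hypothetical; this is
an illustration of the idea, not how any existing compose tooling
works.

```python
from typing import Dict, List

# Hypothetical data model: symbol name -> checksum string.
SymbolMap = Dict[str, str]


def broken_kmods(kernel_exports: SymbolMap,
                 kmod_requires: Dict[str, SymbolMap]) -> Dict[str, List[str]]:
    """Map each failing kmod to the symbols that are missing or changed."""
    failures: Dict[str, List[str]] = {}
    for kmod, required in kmod_requires.items():
        # A symbol is a problem if the kernel no longer exports it,
        # or exports it with a different checksum.
        bad = sorted(sym for sym, crc in required.items()
                     if kernel_exports.get(sym) != crc)
        if bad:
            failures[kmod] = bad
    return failures


def may_release(kernel_exports: SymbolMap,
                kmod_requires: Dict[str, SymbolMap]) -> bool:
    """Gate: only release the kernel if no tracked kmod would break."""
    return not broken_kmods(kernel_exports, kmod_requires)


# Illustrative data only; symbol and package names are invented.
kernel = {"sym_a": "0x1111", "sym_b": "0x2222"}
kmods = {
    "kmod-good": {"sym_a": "0x1111"},
    "kmod-stale": {"sym_a": "0x1111", "sym_b": "0xdead"},
}
print(broken_kmods(kernel, kmods))  # {'kmod-stale': ['sym_b']}
print(may_release(kernel, kmods))   # False
```

The point of returning the failing symbols rather than just a yes/no
is that the same report tells the kernel team what broke and for whom.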

> > The LF/RH/SUSE kernel module packaging system (branded as the Driver
> > Update Program by Red Hat) relies on one of two things happening to be
> > reasonably successful:
> >
> > * Gating to ensure kABI doesn't break (RHEL-style)
> > * Continuous automatic rebuilds as the kABI changes (SUSE-style)
> >
> > At work, we've internally implemented the SUSE-style strategy with our
> > RHEL kernel module builds, but we're able to do that because our build
> > system is designed to handle that. Within the CentOS Project with
> > CKI/ARK and CentOS Stream, we should be implementing the RHEL-style
> > strategy.
> >
> > More than most, I get why you're upset about the kABI always breaking
> > as kernel updates push out, but instead of just saying "it's not
> > suitable", we should be building solutions to *make* it suitable for
> > the Enterprise. It's *bad* that the RHEL kernel breaks its own
> > promises so often (which is a relatively new thing, in my experience),
> > and we should be implementing safeguards to stop it from happening
> > going forward.
> >
> >
>
> To be fair to Red Hat, they are not breaking their own promises (nor
> even the kABI by their own definition) as Red Hat only strive to retain
> kABI compatibility for symbols on their own defined whitelist.
>
> What happens in reality (especially in the first 5 years during the
> active development phase or Stream phase) is that Red Hat branch the
> RHEL kernel at point release time. The 8.3 kernel, for example, stays
> stable for 6 months with only important bug fixes and security fixes, but
> no new features, whilst the RHEL development kernel branch for 8.4, which
> is now being released to Stream, gets all the big backports that will be
> in the 8.4 kernel. Those backports are what cause breakage of
> symbols that are not on the kABI whitelist but are used in the real
> world by many/most 3rd party drivers.
>

Right, but I think Brendan Conoboy has said elsewhere on this list
that they want to expand the kABI whitelist as much as practically
possible based on real-world data. We have a large repository of
third-party kernel modules with ELRepo (among others), and we can use
that to help the RHEL team expand the whitelist so that those modules
don't break so often.
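
To make "real-world data" a bit more concrete, here is a minimal
sketch of how a module repository could be turned into a ranked list
of whitelist candidates. It assumes we already have, for each module,
the set of kernel symbols it uses (how that list is extracted is out
of scope here). The function name, the `min_users` threshold, and the
sample data are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def whitelist_candidates(whitelist: Set[str],
                         kmod_symbols: Dict[str, Iterable[str]],
                         min_users: int = 2) -> List[Tuple[str, int]]:
    """Rank non-whitelisted symbols by how many modules depend on them."""
    usage: Counter = Counter()
    for symbols in kmod_symbols.values():
        # Count each symbol once per module, ignoring ones already covered.
        usage.update(set(symbols) - whitelist)
    ranked = [(sym, n) for sym, n in usage.items() if n >= min_users]
    # Most widely used first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Illustrative data only; symbol and module names are invented.
current_whitelist = {"sym_w"}
modules = {
    "kmod-one": ["sym_w", "sym_x", "sym_y"],
    "kmod-two": ["sym_x", "sym_y"],
    "kmod-three": ["sym_x"],
}
print(whitelist_candidates(current_whitelist, modules))
# [('sym_x', 3), ('sym_y', 2)]
```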

> It is really important that this process happens. If it didn't, we
> wouldn't for example get a new WiFi stack in RHEL8.1 backported from
> kernel-5.2 or in RHEL8.3 backported from kernel-5.7 and none of our
> fancy new WiFi adapters would work.
>

Yeah, I know. The work that's done there is extremely impressive and I
love that they do it.

> It's also really important that this process is being opened up if Red
> Hat want people (the community) outside of the Red Hat kernel
> development team to be able to contribute to it.
>

Judging by Mike and Brendan's comments, I'm pretty sure that this is
absolutely part of the strategy here.

> So I'm absolutely not against it and nor do I want to prevent or stop it
> from happening. Quite the opposite - I am really looking forward to the
> day I can contribute simple fixes to the RHEL kernel rather than having
> to file a bug and wait months/years to see the incorporation of a simple
> upstream fix or have to open a support case and spend months dealing
> with people that do not understand the issue. But above all I just want
> people to recognise that this is a *development* system and stop trying
> to tell people that it is a drop-in replacement for CentOS Linux because
> it is not.
>

In the strictest sense, it obviously is not. But in a very real
practical sense, it absolutely is. Aside from the kernel issues (which
I firmly believe are solvable), people are generally not going to
notice a difference between CentOS Linux 8 and CentOS Stream 8.

My CentOS Linux 8 boxes were replaced with CentOS Stream 8 back in the
spring because it was strictly better for production *and*
development. I've been opportunistically switching our build targets
from CentOS Linux 8 to CentOS Stream 8 for most of the year. With the
retirement of CentOS Linux 8, it now becomes more of a priority, but it
was already going to happen.

And for the couple of third-party kernel modules we use, our build
system already triggers automatic rebuilds when new kernel builds are
released. Heck, I support *Fedora* just fine because of that. So the
kernel thing has been a non-issue for me. However, I recognize that
most people haven't put in the engineering effort *I* have to support
that kind of thing, and Red Hat's kABI promises are intended to make
it so people don't have to. So we need to improve processes around
that to make kernel updates less painful for third party kernel module
maintainers, since that's a huge part of the value of the RHEL/CentOS
ecosystem.
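
For anyone curious what the rebuild-on-new-kernel approach boils down
to, the core of it is just bookkeeping. The sketch below is not our
actual build system; it assumes a hypothetical record of which kernel
builds each module has already been built against, and works out what
still needs building. The function name and the version strings are
illustrative.

```python
from typing import Dict, List, Set


def rebuilds_needed(released_kernels: List[str],
                    built_for: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each kmod to the released kernels it has no build for yet."""
    plan: Dict[str, List[str]] = {}
    for kmod, done in built_for.items():
        # Keep release order so older kernels are rebuilt first.
        missing = [k for k in released_kernels if k not in done]
        if missing:
            plan[kmod] = missing
    return plan


# Illustrative data only; module names and versions are examples.
kernels = ["4.18.0-257.el8", "4.18.0-259.el8"]
state = {
    "kmod-current": {"4.18.0-257.el8", "4.18.0-259.el8"},
    "kmod-behind": {"4.18.0-257.el8"},
}
print(rebuilds_needed(kernels, state))
# {'kmod-behind': ['4.18.0-259.el8']}
```

A build system would run something like this whenever a new kernel
build shows up and queue one build job per entry in the result.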

CentOS Stream 8 may suck a bit for that *now*, but it doesn't have to
remain that way. In the future, it could make both RHEL and CentOS
even better in this regard if we take advantage of this opportunity.




--
真実はいつも一つ!/ Always, there's only one truth!