[CentOS] kpatch (live kernel patching) in CentOS 7.7?

Fri Oct 4 12:17:56 UTC 2019
Phelps, Matthew <mphelps at cfa.harvard.edu>

On Fri, Oct 4, 2019 at 6:33 AM Jim Perrin <jperrin at centos.org> wrote:

>
>
> On 10/3/19 9:35 PM, Stephen John Smoogen wrote:
> > On Thu, 3 Oct 2019 at 13:52, Phelps, Matthew <mphelps at cfa.harvard.edu> wrote:
> >>
> >> On Thu, Oct 3, 2019 at 1:42 PM Jim Perrin <jperrin at centos.org> wrote:
> >>
> >>>
> >>>
> >>> On 10/3/19 1:32 PM, Phelps, Matthew wrote:
> >>>> Forgive me if this has been answered before and I've missed it.
> >>>>
> >>>> This https://access.redhat.com/solutions/2206511 says live kernel patches
> >>>> will be available via yum updates as of RHEL 7.7. Is this carried over to
> >>>> CentOS 7.7.1908?
> >>>>
> >>>
> >>> The functionality should be available, but we don't provide patches in
> >>> this way, no.
> >
> >>
> >> What would it take to make this happen? This would be a huge help to those
> >> of us running servers. Not to mention it would make the world a more secure
> >> place :)
> >>
>
> The short answer is "a team of kernel engineers, which we don't have".
> Smooge's overview which I've left below is great at explaining some of
> this:
>
>
I don't understand. If RHEL is putting out patches, and CentOS is a
recompile of RHEL, hasn't that "team of kernel engineers" already done the
work?

I fully realize this is not a panacea for never rebooting again, but if we
can patch a critical kernel bug immediately, then schedule less disruptive
reboots in a week or three, this would help tremendously.

> >> Is it an upstream issue? No SRPMS available? Etc?
> >>
>
> It's quite a bit more work than just SRPM (re) building. This is one of
> those things where if your workflow requires this functionality rather
> than the occasional reboot you should really just pay for RHEL. They put
> far more people and testing behind this feature than the team building
> CentOS is able to.
>
> (DISCLAIMER: I work for RH, so that may not sound as true as it is)
>
>
I knew someone was going to say that. :) In our case, as I'm sure is the
case for many other environments, we are a noncommercial CentOS shop that
can't afford the resources to have a mixed environment, not to mention the
RHEL licenses. Not all of the machines I'm thinking of are critical
infrastructure. We have many researchers running simulations that take
weeks, sometimes months, to finish, and avoiding the occasional forced
immediate reboot for a critical kernel bug would help expand Human
Knowledge :).

Anyway, I saw the functionality for live kernel patching in the RHEL 7.7
release notes, which the CentOS 7.7.1908 release notes pointed to, and
assumed (hoped?) that it would be available for us as well. If it won't
ever be provided, then I suggest the CentOS documentation be updated to
explicitly state so.


>
> >> Just trying to understand. I don't follow the centos-devel list. Has this
> >> been discussed there, or elsewhere?
> >>
> >
> > There is a lot to go into making a correct kpatch. You have to
> > determine that you have a working kpatch (you can have one which works
> > on 1% and corrupts 80% and crashes 19%), you have to determine that
> > the patch fixes the problem (you can build patches which should do the
> > right thing but don't), and you have to determine that it doesn't add
> > in some sort of long term corruption of memory/disk/etc. That takes
> > specialized kernel expertise, a large amount of varied hardware to
> > test the patch on, some amount of time, and a very large test suite.
> >
> > You can also only live patch a system so many times and in only
> > certain places. There are just some parts of the kernel which have to
> > be rebooted and others you can put in a patch which works but your
> > performance is going to be 25% of what it was before. There are other
> > places that if you patch.. that is it.. try another and you hardlock.
> > As much as some sites like to call it some sort of panacea for never
> > having to reboot again.. it is really meant to be a tourniquet to air
> > chopper the crash victim to a hospital. They may still not make it...
> > you are just giving them a chance.
> >
> >
> >
>
> --
> Jim Perrin
> The CentOS Project | http://www.centos.org
> twitter: @BitIntegrity | GPG Key: FA09AD77


-- 

*Matt Phelps*

*Information Technology Specialist, Systems Administrator*

(Computation Facility, Smithsonian Astrophysical Observatory)

Center for Astrophysics | Harvard & Smithsonian


60 Garden Street | MS 39 | Cambridge, MA 02138
email: mphelps at cfa.harvard.edu

