[CentOS] When should I reboot?

On Apr 13, 2019, at 2:32 AM, Kenneth Porter <shiva at sewingwitch.com> wrote:
> 
> I reboot when I yum update to a new kernel or systemd, which seems to come out about once a month.

You can use logic similar to that in Tony Mountfield’s answer to put off reboots in those cases as well.

If the reason for the kernel update is a bug in a Realtek NIC driver but your systems all use Intel NICs, you don’t need to reboot.

Let’s get concrete.  Just a few days ago, this CVE was filed against the Linux kernel:

   https://nvd.nist.gov/vuln/detail/CVE-2019-11191

I assume CentOS doesn’t ship any a.out binaries, so this bug is of no consequence to most CentOS systems.  For it to matter to your systems, your threat model must allow either:

1. Arbitrary code upload by someone with root privileges so they can setuid a newly uploaded a.out binary.  The only way such a situation is not already Game Over would be something like a VPS host where there are multiple “root” privilege levels.  If you’re not running such a hosting service, you probably don’t care about this bug.

2. Local staff to create a.out binaries and setuid them.  But why would that happen?  That’s two very uncommon conditions back to back.  On top of that, the threat model then must include the ability for your attacker to run one of these binaries; if the threat model is network outsiders only and these are not network services, the bug *still* doesn’t affect you.

Now let’s take systemd.

Systemd isn’t a single binary, and most of those binaries don’t run continuously.  (On a near-stock CentOS 7 VM I have here, only 5 of the 41 programs under bin/ in the systemd RPM are running right now.)  If the systemd component being updated doesn’t run continuously or can safely be restarted individually, you don’t need to reboot.  The component might not be running at upgrade time, or it might be easily restarted if it is running.
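If you want to repeat that count on your own box, here is a rough Python sketch.  It assumes you feed it the package’s file list yourself (for example, the output of “rpm -ql systemd”) and that /proc is readable for the processes you care about.  The function names are mine, not anything shipped by systemd or yum.

```python
import os

DELETED_SUFFIX = " (deleted)"


def strip_deleted(path):
    """Drop the ' (deleted)' marker the kernel appends to replaced files."""
    if path.endswith(DELETED_SUFFIX):
        return path[:-len(DELETED_SUFFIX)]
    return path


def running_executables(proc_root="/proc"):
    """Return the set of executable paths for every process we can inspect."""
    paths = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            paths.add(os.readlink(os.path.join(proc_root, entry, "exe")))
        except OSError:
            # Kernel threads, processes that just exited, or no permission.
            continue
    return paths


def running_from_package(package_files, exe_paths):
    """Return the package files that are running right now, sorted."""
    running = {strip_deleted(p) for p in exe_paths}
    return sorted(set(package_files) & running)
```

Call running_from_package() with the package’s file list and the result of running_executables(), as root if you want to see every process.  Anything in the result is a candidate for an individual restart; everything else in the package picks up the update the next time it runs.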

The glibc updates can also be put off, depending on the bug in question and the system’s threat model.  If you deem that the only threats worth responding to are those from the network, with everything internal to the server being deemed “good,” then the questions become “What’s listening to the network, can it/they be restarted, and which ones use affected glibc facilities?”
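To answer the “What’s listening?” part, you can read the kernel’s own socket table.  The sketch below parses text in the format of /proc/net/tcp and returns the ports in the LISTEN state.  It is only an illustration: it ignores UDP, it tells you ports rather than programs, and the function name is my own invention.

```python
TCP_LISTEN = "0A"  # hex state code the kernel uses for listening sockets


def listening_tcp_ports(proc_net_tcp_text):
    """Return the sorted TCP ports in LISTEN state from /proc/net/tcp text."""
    ports = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is a header
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or truncated line
        local_address, state = fields[1], fields[3]
        if state != TCP_LISTEN:
            continue
        # local_address is HEXADDR:HEXPORT; the same split works for tcp6.
        port_hex = local_address.rsplit(":", 1)[1]
        ports.add(int(port_hex, 16))
    return sorted(ports)
```

Feed it the contents of /proc/net/tcp, and of /proc/net/tcp6 if you use IPv6.  In practice “ss -tlnp” run as root gives you the same ports along with the owning process names, which is what you need for the restart question.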

Let’s take a recent glibc CVE as an example:

    https://nvd.nist.gov/vuln/detail/CVE-2019-9169

If your network listening services aren’t doing case-insensitive POSIX regex matches, this bug cannot affect them, so under our stated threat model, the network services don’t need to be restarted, much less the whole system.

If you have network-listening services that *are* doing case-insensitive POSIX regex matches, then I assume the bug must only be happening with *particular* regexes, else we’d have learned of this bug decades ago, so your threat model must also allow the attacker to provide the regex.  That excludes, for example, regexes in your Apache configuration file, unless you’re running a shared web hosting service and allow arbitrary changes to the Apache config.

> I know the glibc update was mainly to handle the new Japanese calendar

Not “mainly,” “solely.”

> So my question is more about how shared libraries work and whether anything bad would happen with different forks of running services (mainly the mail suite with dovecot and the various content scanners launched by sendmail) running different versions of the library based on when they were started. 

As Tony said, each running binary continues with its prior copy of glibc and newly launched binaries get the new one.
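You can see this for yourself in /proc.  After the update replaces the library file on disk, a process that still has the old copy mapped shows it with a “ (deleted)” marker in /proc/<pid>/maps.  Here is a small sketch that picks those entries out of maps-format text.  The default “libc-” name fragment is an assumption that matches names like libc-2.17.so; adjust it for your system.

```python
import os

DELETED_SUFFIX = " (deleted)"


def deleted_libraries(maps_text, name_fragment="libc-"):
    """Return sorted paths of deleted mapped files whose name matches.

    maps_text is the content of a /proc/<pid>/maps file.  name_fragment
    is an assumed pattern for the glibc file name; change it as needed.
    """
    found = set()
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if not path.endswith(DELETED_SUFFIX):
            continue  # still backed by the file currently on disk
        path = path[:-len(DELETED_SUFFIX)]
        if name_fragment in os.path.basename(path):
            found.add(path)
    return sorted(found)
```

Run it over the maps file of each PID you care about, as root.  A non-empty result means that process is still on the old glibc and will stay there until it is restarted.  The needs-restarting tool in yum-utils does a broader version of the same kind of check.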

Unless you’ve got a set of binaries that can end up with different glibc underpinnings and they are passing around Japanese date strings with the assumption that they agree on their interpretation, I can’t see how this can affect you.

Not every CVE affects everybody, but Red Hat has to respond to most[*] of those affecting the code bases behind the binaries they build, ship, and support, because chances are, not doing so will affect some nontrivial subset of their user base.

But whether the bug affects *you* is a wholly different question.  That’s why they publish these advisories, and why those advisories often link to further information.  You have to be willing and able to absorb and analyze this information if you don’t want to fall back on generic advice like “Reboot on every kernel, glibc, or systemd update.”

On the other hand, maybe you have good reason to reboot once a month or so anyway, and all of this provides a convenient excuse: “Because security.”  You might be subject to an uptime SLA that excludes security reboots, which let you slip other maintenance downtime into those reboot windows.

[*] I assume there are conditions that would lead Red Hat to ignore a CVE that does affect code it ships, but I have no ready examples.  If it happens, I trust their judgement.