[CentOS] Re: Reboots -- Reboot Logic ...

Bryan J. Smith <b.j.smith@ieee.org>

thebs413 at earthlink.net
Thu Jun 2 17:34:15 UTC 2005


The main reason I see this topic come up over and over is that people
who are new to UNIX and coming from Windows don't realize that they have
been "programmed" into a particular idea of what reboots are for.

Prior to the proliferation of Windows, barring hardware failure, reboots
were pretty much limited to scheduled maintenance.  Sure, user-space
programs have memory leaks, new options are buggy, etc...  And sometimes
you have to stop, start or debug those services.

But for the most part, the core kernel in just about any UNIX system
has been fairly devoid of major issues because of the inherent
attitude of its developers.  _Nothing_ goes into the kernel that is not
required, and that means _not_ putting services in there for performance
or other hacks.  So memory leaks and the other things that are so
problematic in newer services or changes are typically confined to
user-space services, which can be fully pre-emptive.

Even NT 3.1 put things into kernel space that should have never been
in there.  It got worse with NT 3.51 "Daytona" for "Chicago" (DOS7.0
aka "Windows 95") compatibility, and NT 4.0 "Cairo" was the ultimate
bastard prior to NT 5.1 (XP/2003).

In the UNIX world, you can stop all sorts of user-space services --
pretty much all of them if you wish -- which many admins do to resolve
issues.  Take down nmb/smb, take down ldapd, etc...  The system
usually stays up and continues servicing many other capabilities --
because they are not part of the kernel space itself.
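
To make the "take down one service, not the box" idea concrete, here is
a minimal sketch in Python.  It assumes a SysV-style `service NAME
ACTION` wrapper is on the PATH (as on Red Hat-style systems); the
function names `service_command` and `bounce` are made up for this
illustration, and the `run` parameter exists only so the logic can be
exercised without touching a real system.

```python
import subprocess

def service_command(name, action):
    """Build the argv for a SysV-style `service NAME ACTION` call."""
    allowed = {"start", "stop", "restart", "status"}
    if action not in allowed:
        raise ValueError("unsupported action: %s" % action)
    return ["service", name, action]

def bounce(name, run=subprocess.call):
    """Stop, then start, one service.  Returns True if both steps exit 0.

    `run` is any callable that takes an argv list and returns an exit
    code; it defaults to subprocess.call (assumption: run as root).
    """
    for action in ("stop", "start"):
        if run(service_command(name, action)) != 0:
            return False
    return True

# Example (hypothetical): bounce("smb") would stop and start only the
# Samba init script, leaving everything else on the system running.
```

The point of the sketch is only that the unit of recovery is the
service, not the machine; a failed stop or start is reported rather
than papered over with a reboot.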

This is very much unlike the NT world.  The server, spooler and
other services are tied into the kernel.  Restarting them is typically
impossible or ineffective when there is an issue, because the issue
is in the kernel, and the user-space component restart doesn't help.
Then there are the "Chicago" components that have been tacked on
in NT 3.51, 4.0 and, probably most deadly, NT 5.1 (XP/2003).

These services and libraries -- designed for the "free for all" of
"Chicago" (including MS IE) -- wreak havoc on the protected NT kernel's
space.

KEY POINT:  

This is why you should _never_ reboot UNIX/Linux in the hope that it
solves the problem the way it typically does on Windows.

In the Windows world, it typically fixes the issue -- even if only
temporarily -- because even the user-space components typically have
kernel-space services for performance and the like (e.g., IIS, SMB).  Although
Microsoft seems to be putting IIS and other things into the "Indigo
.NET Sandbox" in NT6.0 Longhorn, the ADS, SMB and other non-.NET
services are still going to be 100% Legacy Win32+Chicago code.
So it's not getting any better anytime soon, and reboots will still be
required to "clean things out."

In the UNIX world, if you reboot, you are typically presented with _no_
resolution.  Your best bet is to find the root cause and fix it, instead
of wasting time rebooting.  Plus, there are the additional issues
that a UNIX reboot can itself introduce.

E.g., UNIX/Linux systems are rebooted so infrequently that many
boot-time configuration changes go untested for months, or even
years, until the next boot finally exercises them!  So it's best to
_only_ reboot UNIX/Linux when you have time for scheduled maintenance.
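
One way to see what a reboot might spring on you is to list the
boot-time files that have been modified since the system last came up.
The following is a rough sketch only: it assumes a Linux-style
/proc/uptime, the helper names `boot_time` and `changed_since` are
invented for the example, and the file list in the closing comment is
illustrative rather than complete.

```python
import os
import time

def boot_time(uptime_path="/proc/uptime", now=None):
    """Approximate boot time (epoch seconds) from Linux's /proc/uptime.

    The first field of that file is seconds since boot.
    """
    if now is None:
        now = time.time()
    with open(uptime_path) as f:
        uptime_seconds = float(f.read().split()[0])
    return now - uptime_seconds

def changed_since(paths, since, mtime=os.path.getmtime):
    """Return the paths whose modification time is later than `since`."""
    changed = []
    for path in paths:
        try:
            if mtime(path) > since:
                changed.append(path)
        except OSError:
            # Missing or unreadable file: skipped in this sketch.
            pass
    return changed

# Example (paths are illustrative; adjust for your system):
#   changed_since(["/etc/fstab", "/etc/inittab", "/boot/grub/grub.conf"],
#                 boot_time())
```

Anything this turns up has never been exercised by a boot, so it is
worth reviewing before the maintenance window rather than during it.
Note that modification time only says a file was touched, not that the
change is wrong.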

So if there is one thing I strongly discourage new UNIX/Linux
administrators from doing, it is an "impulsive reboot."  It's better to
just work the problem than to reboot, waste time (in the best case) and
possibly introduce even more unforeseen issues from those changes (in
the worst case).



--
Bryan J. Smith   mailto:b.j.smith at ieee.org



