[CentOS] System Start-Up Issue

Mon Jul 3 15:17:22 UTC 2017
m.roth at 5-cent.us <m.roth at 5-cent.us>

Mark Haney wrote:
> On 07/03/2017 10:52 AM, m.roth at 5-cent.us wrote:
>> Chris Olson wrote:
>>>      On Monday, July 3, 2017 5:58 AM, "m.roth at 5-cent.us"
>>> <m.roth at 5-cent.us>
>>> wrote:
>>>   Chris Olson wrote:
>>> <snip>
>>>> I went on vacation right after an update to one of our virtual CentOS
>>>> 6.9 systems so it was not restarted for a period of time.  Now it will
>>>> not complete boot-up with the gnome display never fully launched.  A
>>>> progress bar at the bottom of the start-up screen never reaches
>>>> completion. We have not been able to detect a running system on the
>>>> network.
>>>>
>>>> Two options for stopping the CentOS 6.9 virtual machine have been
>>>> tried.
>>>> One is to "power off" and the other is to "send the shutdown message".
>>>> Both of these options appear to work properly.  The shutdown output
>>> <snip>
>>> Suggestion: boot to the previous kernel. If that works, reinstall the
>>> update, then reboot to it.
>>>
>>> We had real issues months back, where a yum-cron appeared to
>>> half-ignore the exclude=kernel line in yum.conf, and it would
>>> consistently fail to boot, but once the above was done, reinstalling
>>> the latest kernel, *then* it rebooted with no problem.
>
> Okay, stupid question, if yum-cron was jacked up months back are you
> still using it?  And if so, why?  Never in my life have I ever scheduled
> updates on any server for any reason.  Mostly because I don't trust it
> to do it right.  Also mostly because I use ansible to manage that, and
> that playbook is always manually run just in case there's an issue.
>
> But yeah, you might be hosed. If this is a VM, do you not have a
> snapshot handy?  (I know, I'm late to the party but was camping this
> weekend.)

I think you're mixing up the OP and me. The issue for us seems to have
been fixed - I suspect it was an issue with yum - it was as though it
updated the kernel, but then didn't run the postinstall scripts. Haven't
had that happen in a while.
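
For anyone who wants to check for that symptom, here's a rough sketch, not
anything we actually run. It assumes the usual CentOS 6 naming in /boot
(vmlinuz-<version> and initramfs-<version>.img), and the function name is
made up for illustration. It only reports kernels that have an image but no
matching initramfs, which is one thing you'd expect to see if the kernel
package's scripts never ran.

```python
import re


def kernels_missing_initramfs(boot_files):
    """Return kernel versions that have a vmlinuz image but no initramfs.

    boot_files is a list of file names, e.g. the result of os.listdir("/boot").
    The naming scheme is an assumption based on stock CentOS 6 layouts.
    """
    names = set(boot_files)
    missing = []
    for name in sorted(names):
        # Only plain "vmlinuz-<version>" entries count; dot-files and
        # config/System.map files are ignored.
        match = re.match(r"vmlinuz-(.+)$", name)
        if match and "initramfs-%s.img" % match.group(1) not in names:
            missing.append(match.group(1))
    return missing
```

You'd call it as kernels_missing_initramfs(os.listdir("/boot")) after an
"import os". An empty list doesn't prove the update was clean (the grub
entry could still be missing), so treat it as a first check only.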

And no, not a VM. We only have a few VMs; most are bare metal. But then,
our servers are for scientific computing, and we need every bloody CPU
cycle. <g> I've got users who may be the only one on a server, or a server
with *two* Tesla cards, or a cluster of 23 or 24 servers, with over 1100
or 512 cores, whose jobs run, literally, for weeks.

      mark