[CentOS] 7.2 kernel panic on boot

Fri Dec 4 16:35:11 UTC 2015
Lamar Owen <lowen at pari.edu>

On 12/04/2015 08:02 AM, mark wrote:
> No, *you* don't understand what we're saying: pre-systemd, if the o/p 
> saw that one stmt before the panic, they could look at what the system 
> was doing *sequentially*, and so have an idea what it was failing on. 
> With systemd's parallelism, we have no clue, other than what it's 
> done, and no idea what's happening that's failing.
>
It has never been true that a kernel panic was necessarily caused by the 
immediately preceding step in a sequential init.  I ran into one 
instance (back in the CentOS 4.x days, incidentally, where 'x' was 1 or 
2) where a panic was caused by the tg3 driver, but it wasn't tickled 
until a variable number of packets had passed the interface, and it 
didn't happen very often.  When it did happen, it was almost always 
during ssh startup.  But the root cause was the tg3 driver module, not 
sshd.  So having the ssh startup as the last line before the panic was 
actually a hindrance rather than a help in that case; I would have been 
looking for an sshd problem that didn't actually exist.

I don't think that's an isolated instance, either.  You need the module 
information from the panic more than information on what was started 
immediately prior to the panic.  This was fixed without my having to 
file a bug report, incidentally, so as far as I recall there is no BZ # 
to point you to, and a quick search of bugzilla doesn't show one for 
that particular issue.  I ended up seeing that it was a tg3 problem 
after setting up a serial console and grabbing the panic output from 
that.  By the time I got to that point, the next update rollup for 
CentOS 4 was coming down, and that was the end of that problem.
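
To illustrate what I mean by pulling the module information out of the 
panic output: here is a minimal sketch, assuming you have saved the 
serial console output as text.  The function name and the sample text 
are made up for illustration (this is not my actual capture); the only 
thing it relies on is the "Modules linked in:" line that the kernel 
prints as part of an oops.

```python
import re


def modules_linked_in(console_text):
    """Return module names from the first 'Modules linked in:' line, or []."""
    marker = "Modules linked in:"
    for line in console_text.splitlines():
        pos = line.find(marker)
        if pos == -1:
            continue
        rest = line[pos + len(marker):]
        # Drop a trailing "[last unloaded: ...]" note if present.
        rest = re.sub(r"\[last unloaded:[^\]]*\]", "", rest)
        # Strip taint flags such as "(PO)" that may follow a module name.
        return [re.sub(r"\(.*?\)$", "", name) for name in rest.split()]
    return []


# Hypothetical console capture, invented for this example.
SAMPLE = """\
Starting sshd:
Modules linked in: tg3 ext3 jbd [last unloaded: floppy]
Kernel panic - not syncing: Fatal exception in interrupt
"""

print(modules_linked_in(SAMPLE))
```

The last init message in that sample is the sshd startup, but the list 
that comes back is the set of loaded modules, which is where I would 
start looking.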

I keep thinking I'll track down the panic I saw a few months ago with 
CentOS 7 and gkrellm on my hardware, but by the time I get enough 'round 
toits' to do the troubleshooting the kernel has been updated, and I have 
to wait on the debuginfo... lather, rinse, repeat.  Eventually I'll get 
my timing right and see what is (or maybe is not) happening.