On Wed, 22 May 2019 at 09:30, mark <m.roth@5-cent.us> wrote:
Ok, we used to get this occasionally on cluster nodes, and we just got it on a fileserver (very bad). The system is discovered to be unresponsive: it doesn't ping, and plugging a console in, you can see that it's not dead, but there is nothing at all on the screen, nor does it respond to even <ctrl-alt-del>. The only answer is to power cycle it; it comes up fine.
Nothing in /var/log/dmesg or /var/log/messages. No abrts I can find. sar tells me it went unresponsive between 18:10 and 10:20 yesterday. Note that there are no further entries in sar, either, for yesterday, after the event, and nothing till I power cycled it.
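For anyone wanting to narrow down a window like this, one rough way is to look for holes in the sample timestamps. The following is only a sketch: `find_gaps` is a hypothetical helper, the 10-minute interval is an assumption about the sampling schedule, and the sample times are made up for illustration, not taken from the affected host. It also assumes all timestamps fall within a single day.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected=timedelta(minutes=10),
              slack=timedelta(minutes=1)):
    """Return (last_seen, next_seen) pairs where two consecutive samples
    are further apart than expected + slack.

    timestamps: "HH:MM:SS" strings in chronological order, same day.
    """
    times = [datetime.strptime(t, "%H:%M:%S") for t in timestamps]
    gaps = []
    for prev, cur in zip(times, times[1:]):
        if cur - prev > expected + slack:
            gaps.append((prev.strftime("%H:%M:%S"),
                         cur.strftime("%H:%M:%S")))
    return gaps

# Made-up sample times, for illustration only.
sample = ["17:50:01", "18:00:01", "18:10:01", "18:40:01", "18:50:01"]

if __name__ == "__main__":
    for last_seen, next_seen in find_gaps(sample):
        print(f"no samples between {last_seen} and {next_seen}")
```

The last sample before a gap gives a lower bound on when the machine stopped recording; it says nothing about why.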
From the above description, I would normally say it sounds like hardware.
However, why do you say the system is not dead when you plug in a console, if there is nothing on the screen and it doesn't respond to control-alt-delete? To me that sounds like 'dead'. Usually the CPU is hard-locked, or the hardware went into 'over-heat' and put everything into a deep sleep, hoping it would cool down, but it never woke up.
Has anyone else seen this - I can't imagine it's only us - or have any thoughts?
CentOS 7, 7.6.1810
mark
CentOS mailing list CentOS@centos.org https://lists.centos.org/mailman/listinfo/centos