You should be able to recognize or monitor this by configuring syslog to print everything to a specific TTY, or by using its remote logging functionality.
Kind regards Thomas
On Thu, May 23, 2019 at 18:31, Jon Pruente <jpruente@riskanalytics.com> wrote:
On Wed, May 22, 2019 at 10:02 AM mark m.roth@5-cent.us wrote:
That seems unlikely. For one, I've seen that... but I *always* see entries
in the log about the oom-killer being invoked. For another, this isn't a compute node, it's *only* a fileserver, serving projects, home directories, and backups (home-grown b/u, uses rsync), and backups don't start until well after midnight, and as we're business-hours only, there was less usage, and it does have 256G RAM....
I have two servers that would occasionally lock up like this, and if I let them sit at the console long enough they would sometimes give a login prompt. It took a lot of time and frustration (these are prod servers), but I tracked it down to a problem in the XFS driver, as it never occurred on the systems with EXT4 filesystems. The XFS driver would hang, preventing writes to the filesystem. I could identify exactly when that happened, as all system logging would suddenly stop at the same second. Then the OOM killer would come in and start killing off processes, but that wouldn't show up in the logs on disk because nothing could be written to the filesystem. I rolled the servers back to a 5xx series kernel and the issue didn't resurface. I recently let them boot the newer 9xx series kernels, and I'm hoping the XFS issue is fixed.
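
For anyone who wants to look for the same "logging suddenly stops" symptom, here is a rough Python sketch, not something from the thread itself. It assumes the traditional syslog timestamp prefix ("May 22 02:14:07"), an English/C locale for month names, and a year supplied by the caller, since that format does not carry one. The function name `find_logging_gaps`, the 300-second threshold, and the sample lines are all made up for illustration.

```python
from datetime import datetime


def find_logging_gaps(lines, year, min_gap_seconds=300):
    """Return (last_seen, next_seen, gap_seconds) for each silence in the log.

    Assumes each line starts with a 15-character traditional syslog
    timestamp such as "May 22 02:14:07". Lines that do not parse are skipped.
    """
    stamps = []
    for line in lines:
        try:
            # Prepend the assumed year so the timestamp is unambiguous.
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue
        stamps.append(stamp)

    gaps = []
    # Compare each entry with the one that follows it.
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta >= min_gap_seconds:
            gaps.append((prev, cur, delta))
    return gaps


if __name__ == "__main__":
    # Hypothetical log lines, invented for this example.
    sample = [
        "May 22 02:13:58 fileserver systemd: Started Session 41 of user root.",
        "May 22 02:14:07 fileserver kernel: example message",
        "May 22 09:41:30 fileserver kernel: first message after reboot",
    ]
    for last_seen, next_seen, seconds in find_logging_gaps(sample, year=2019):
        print(f"no log entries between {last_seen} and {next_seen} ({seconds:.0f}s)")
```

On a CentOS box this could be pointed at the system log with something like `find_logging_gaps(open("/var/log/messages"), year=2019)`. A long silence on a busy server only hints that writes stopped; it does not by itself say whether the filesystem, the logging daemon, or something else was at fault.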