Check the hardware health. A faulty component could be triggering the reboots, or the processor could be overheating, so check that the fans are still working.
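If you want to script that check, here is a minimal Python sketch that reads temperatures and fan speeds from the kernel's hwmon interface. It assumes the sensor drivers are loaded and that the readings appear under /sys/class/hwmon/hwmon*/ (on some older kernels they sit one level deeper, under a device/ subdirectory, which this does not handle). The function names are mine, purely for illustration.

```python
import glob
import os

def _read_int(path):
    """Read a single integer from a sysfs file; return None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

def read_hwmon(root="/sys/class/hwmon"):
    """Return {'temps_c': {path: degrees C}, 'fans_rpm': {path: rpm}}."""
    temps, fans = {}, {}
    for path in glob.glob(os.path.join(root, "hwmon*", "temp*_input")):
        value = _read_int(path)
        if value is not None:
            temps[path] = value / 1000.0  # sysfs reports millidegrees Celsius
    for path in glob.glob(os.path.join(root, "hwmon*", "fan*_input")):
        value = _read_int(path)
        if value is not None:
            fans[path] = value  # sysfs reports RPM
    return {"temps_c": temps, "fans_rpm": fans}
```

Calling read_hwmon() and printing the two dictionaries should make a fan reporting 0 RPM, or a temperature far above the others, easy to spot.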
On Monday, 30 May 2016, Keith Keller kkeller@wombat.san-francisco.ca.us wrote:
Hi Bill,
On 2016-05-30, Bill Gee <bgee@campercaver.net> wrote:
By luck I saw the beginning of a reboot on the server console. Normally I have other systems up on the KVM switch. It appears to have dumped core. I don't know where to look for the core dump files. They are not in /root.
One place you might check is under /var/lib. I think there may be a /var/lib/crash directory which contains core dumps.
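To save some hunting, a small Python sketch like the one below can scan a few likely directories for dump files. The directory list is an assumption on my part: /var/crash is the usual kdump target (and only gets a vmcore if kdump is enabled), /var/lib/systemd/coredump and /var/spool/abrt hold userspace dumps on systems that use those services, and /var/lib/crash is the guess from above. The name find_dumps is just illustrative.

```python
import os

# Candidate locations are assumptions; adjust the list for your system.
CANDIDATE_DIRS = [
    "/var/crash",
    "/var/lib/crash",
    "/var/lib/systemd/coredump",
    "/var/spool/abrt",
]

def find_dumps(dirs=CANDIDATE_DIRS):
    """Return (mtime, size, path) tuples for dump-like files, newest first."""
    hits = []
    for top in dirs:
        if not os.path.isdir(top):
            continue  # directory does not exist on this system
        for root, _subdirs, files in os.walk(top):
            for name in files:
                # Matches names such as vmcore, core.1234, coredump
                if name.startswith(("core", "vmcore")):
                    path = os.path.join(root, name)
                    try:
                        st = os.stat(path)
                    except OSError:
                        continue  # vanished or unreadable
                    hits.append((st.st_mtime, st.st_size, path))
    return sorted(hits, reverse=True)
```

Run it as root and compare the modification times of anything it finds against the time of the reboot.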
I ran MemTest 86+. No memory errors were found.
Another option is to try Advanced Clustering's Breakin, which runs other tests besides memory.
http://www.advancedclustering.com/products/software/breakin/
I've had it find problems that memtest hasn't (and vice-versa).
Lm_sensors shows the processor running between 45 and 50C.
If the system supports IPMI, check those sensors and logs; there may be something useful there. If you don't have IPMI, there may still be something in the BIOS logs. How you get to those varies wildly, and you may need to boot into the BIOS setup to see them.
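For the IPMI route, a rough Python sketch for pulling likely-relevant entries out of the System Event Log follows. It assumes ipmitool is installed and can reach the local BMC, and that `ipmitool sel elist` prints one event per line. The keyword list and function names are hypothetical, and the substring match is deliberately crude.

```python
import shutil
import subprocess

# Keywords are an illustrative assumption, not an exhaustive list.
KEYWORDS = ("critical", "temperature", "voltage", "power", "ecc", "watchdog")

def interesting_events(sel_text, keywords=KEYWORDS):
    """Return SEL lines that mention any keyword (case-insensitive)."""
    return [
        line.strip()
        for line in sel_text.splitlines()
        if any(k in line.lower() for k in keywords)
    ]

def read_sel():
    """Run `ipmitool sel elist`; return its output, or None if unavailable."""
    if shutil.which("ipmitool") is None:
        return None
    result = subprocess.run(
        ["ipmitool", "sel", "elist"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout if result.returncode == 0 else None
```

Something like interesting_events(read_sel() or "") gives a short list to compare against the reboot times. The current sensor readings are available separately from `ipmitool sensor`.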
I hope that helps!
--keith
-- kkeller@wombat.san-francisco.ca.us
CentOS mailing list CentOS@centos.org https://lists.centos.org/mailman/listinfo/centos