[CentOS] how to debug random server reboots

Wed Jun 3 09:15:58 UTC 2009
Sergej Kandyla <sk.paix at gmail.com>

Rudi Ahlers wrote:
> Hi all,
>
> One of our CentOS 5.3 randomly reboots, at different times of the day,
> and I can't see why it's doing it.
>
> I have looked through the logs, but don't see any thing in there that
> shows me why it has rebooted. How can I debug this?
>
>   

Hi,

Try enabling kdump to capture a kernel crash dump, in case this is a 
software-related issue.

http://download.swsoft.com/virtuozzo/virtuozzo4.0/docs/en/lin/VzLinuxUG/20027.htm
Using Kexec and Kdump For System Troubleshooting

yum install kexec-tools
Edit /etc/grub.conf and append to the end of the kernel line:
"crashkernel=128M@16M"
chkconfig kdump on
reboot
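
After the reboot it is worth confirming that the kernel really was started 
with the crashkernel parameter, otherwise kdump has no reserved memory to 
work with. The following is only a minimal sketch; the function name 
crashkernel_value is made up for illustration and is not part of 
kexec-tools.

```shell
#!/bin/sh
# Print the value of crashkernel= from a kernel command line string,
# or print nothing if the parameter is absent.
# crashkernel_value is a hypothetical helper name, not a system tool.
crashkernel_value() {
    for arg in $1; do
        case "$arg" in
            crashkernel=*) printf '%s\n' "${arg#crashkernel=}" ;;
        esac
    done
}

# Check the command line of the currently running kernel.
if [ -r /proc/cmdline ]; then
    value=$(crashkernel_value "$(cat /proc/cmdline)")
    if [ -n "$value" ]; then
        echo "crashkernel reserved: $value"
    else
        echo "crashkernel= is missing from the kernel command line"
    fi
fi
```

If the parameter is present, "service kdump status" should report whether 
the service is operational.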

Also take a look at these:

http://kbase.redhat.com/faq/docs/DOC-6039
How do I configure kexec/kdump on Red Hat Enterprise Linux 5?

http://kbase.redhat.com/faq/docs/DOC-2119
How can I voluntarily crash my machine to test if netdump/diskdump/kdump 
I configured works?

http://kbase.redhat.com/faq/docs/DOC-5413
My server crashes once in awhile. How can I debug it?

http://kbase.redhat.com/faq/docs/DOC-1742
My system has started to hang randomly. What information does Red Hat's 
technical support need to diagnose the problem?

http://kbase.redhat.com/faq/docs/DOC-10828
My Red Hat Enterprise Linux 2.1 system had a kernel panic, an oops 
message, or is freezing for no apparent reason. How can I find out what 
is causing this?


Next, I recommend installing and running memtest86+:
memtest86+.x86_64 : Stand-alone memory tester for x86 and x86-64 computers

Ask your data center support to reboot the machine overnight and choose 
the memtest entry in the GRUB boot menu.
If the DC offers IP KVM, ask for access to it.
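
Before asking for the reboot, you can check that a memtest entry actually 
exists in the GRUB menu. This is a rough sketch under assumptions: the 
helper name has_memtest_entry is made up, and the default path 
/boot/grub/grub.conf assumes GRUB legacy as shipped with CentOS 5. On 
CentOS the memtest86+ package should provide a memtest-setup script that 
adds the entry, but verify that on your own system.

```shell
#!/bin/sh
# Return success if the given GRUB config has a boot entry whose title
# mentions memtest. has_memtest_entry is a hypothetical helper name.
# The default path is an assumption (GRUB legacy on CentOS 5).
has_memtest_entry() {
    grep -qi '^[[:space:]]*title.*memtest' "${1:-/boot/grub/grub.conf}" 2>/dev/null
}

if has_memtest_entry; then
    echo "memtest entry found in GRUB menu"
else
    echo "no memtest entry found; install memtest86+ and run memtest-setup as root"
fi
```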

Also, what network card is in your server?
I have had trouble with no-name network cards.



-- 
Best wishes, Sergej Kandyla
Always smile at life, and life will always smile back at you!