[CentOS] how to debug random server reboots

Wed Jun 3 17:23:52 UTC 2009
Scott Silva <ssilva at sgvwater.com>

on 6-2-2009 11:53 PM Rudi Ahlers spake the following:
> On 6/3/09, Scott Silva <ssilva at sgvwater.com> wrote:
>> on 6-2-2009 2:46 PM Rudi Ahlers spake the following:
>>> On 6/2/09, Scott Silva <ssilva-m4n3GYAQT2lWk0Htik3J/w at public.gmane.org> wrote:
>>>> on 6-2-2009 2:30 PM Rudi Ahlers spake the following:
>>>>> Hi all,
>>>>>
>>>>> One of our CentOS 5.3 servers randomly reboots at different times of
>>>>> the day, and I can't see why it's doing it.
>>>>>
>>>>> I have looked through the logs, but don't see anything in there that
>>>>> shows me why it has rebooted. How can I debug this?
>>>>>
>>>>> Here's a snippet from the log, around the time of the reboot:
>>>>>
>>>>>
>>>> <snip>
>>>> Random reboots can happen fast enough that nothing gets into the logs.
>>>> You can try setting up a console and have the system post there. It
>>>> sometimes catches things.
>>>>
>>>> But until then I would do the obvious... Make sure the system is clean
>>>> and not overheating from "dust bunnies" filling up the chassis.
>>>>
>>>> Remove and re-seat all cards and ram. Make sure all fans are working. Run
>>>> memtest overnight if possible. Look back to when the reboots started and
>>>> see if something was added or upgraded.
>>>>
>>>>
>>> Hi Scott, the server is in the USA, and I'm in ZA. I've been trying to
>>> get the IDC to look into the problem, but they're not very helpful and
>>> reckon I need to check my software. I know the "server" runs desktop
>>> hardware, so it could be a hardware problem, but they don't seem to
>>> think so.
>>>
>>> So, I'm trying to do everything I can, from my side, via SSH to see if
>>> I can figure it out.
>>>
>> Will the data center hang a serial port monitor on it for a while? Many of
>> them will do it for free, or a few dollars a day, and give you remote access
>> into it. Is it your server, or a lease/rental?
>>
>>
>>
> 
> 
> It's a rented server from a 3rd party who feels that it's not their
> problem. Seems I need to get a new server, from someone else.
> 
> 
That might be best, if just to get a decent provider. If they aren't willing
to check it, they are a poor excuse for a service business. And the fact that
the system isn't functioning properly should be enough for you to get out of a
contract if you have one.
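
In the meantime, if SSH is all you have, one thing that might help narrow
it down is a heartbeat file: write a timestamp to disk at a fixed interval
and force it out with fsync, so that even when the logs come up empty you
can see roughly when the box stopped. What follows is only a rough sketch,
not something anyone in this thread has run. The names append_heartbeat
and find_gaps are made up for illustration, and it assumes a Python 3
interpreter is available on the machine.

```python
import os
import time


def append_heartbeat(path, now=None):
    """Append one Unix timestamp to `path` and force it onto disk."""
    stamp = time.time() if now is None else now
    with open(path, "a") as fh:
        fh.write("%d\n" % stamp)
        fh.flush()
        # fsync so the line survives a sudden reset as far as the OS can ensure
        os.fsync(fh.fileno())


def find_gaps(path, max_gap):
    """Return (last_seen, next_seen) pairs more than `max_gap` seconds apart."""
    with open(path) as fh:
        # skip blank or damaged lines that a crash may have left behind
        stamps = [int(line) for line in fh if line.strip().isdigit()]
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > max_gap]
```

Calling append_heartbeat from cron once a minute and then running
find_gaps with a max_gap of a few minutes would list the windows in which
the machine went silent. It won't say why it rebooted, only when, which
you can then line up against cron jobs, backups or load peaks.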
