[CentOS] Random server reboot after update to CentOS 5.3
Ross Walker
rswwalker at gmail.com
Mon Jun 8 15:26:44 UTC 2009
On Jun 8, 2009, at 9:18 AM, Peter Hopfgartner <peter.hopfgartner at r3-gis.com> wrote:
> Scott Silva wrote:
>> on 6-3-2009 2:27 AM Peter Hopfgartner spake the following:
>>
>>> Epilogue:
>>>
>>> I've tried disabling TSO (ethtool -K eth0 tso off), as was suggested
>>> on the poweredge list. This did not help.
>>>
>>> I've configured the machine to boot the 5.2 kernel by changing the
>>> default in /boot/grub/grub.conf. It has now been running for 6 1/2
>>> days. I would say that this helped, and it is what I would suggest
>>> to others experiencing the same problem right now.
>>>
>>> Thus, the currently running kernel is 2.6.18-92.1.10.el5xen.
>>>
>>> Regards and thanks for all replies,
>>>
>>> Peter
>>>
>>>
>> That sure points to a machine/kernel conflict. You could try getting
>> the source and rebuilding to see if that solves it, or maybe diff the
>> two kernel configs to see if something is different there. Maybe
>> something was added or turned on in the new kernel that your system
>> doesn't like.
>>
>> Also, make sure your system's BIOS and firmware are up to date. Not
>> just the motherboard, but any other cards with firmware that might
>> have an update, like RAID/SAS controllers or similar.
>>
> Dear Scott,
>
> Unfortunately, the machine is in production. Any downtime is really a
> problem since it is seen directly by our customers. I would really like
> to make an active effort to isolate the problem, but my boss would cut
> my head off if I had to stop the machine. The firmware is not current,
> but according to Dell's web site I should stop almost every running
> service on the machine before upgrading the firmware, and in that case
> I would again have to watch out for my head. I do really care about
> providing accurate bug reports to the OS projects that I use (I would
> guess that 90% of my reports lead to a quick fix), but in this case I
> have to make an exception and keep the machine running.
Do what works for now and think about a test box or VM setup for the
future where you can test newer kernels before they go into production.
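
In the meantime, the config comparison Scott suggested only reads files,
so it should not need any downtime. A rough sketch, assuming both kernels'
configs are installed as /boot/config-<version> (the usual place on
CentOS). kernel_config_diff is just a helper name made up for this
example, and NEW_VERSION is a placeholder for whichever 5.3 kernel was
giving you trouble:

```shell
#!/bin/sh
# Sketch: print the kernel config options that differ between two builds.
# kernel_config_diff is a hypothetical helper, not an existing tool.
# Usage: kernel_config_diff OLD_CONFIG NEW_CONFIG
kernel_config_diff() {
    old_sorted=$(mktemp) || return 2
    new_sorted=$(mktemp) || return 2

    # Keep "CONFIG_X=..." and "# CONFIG_X is not set" lines; drop other
    # comments and blank lines, then sort so ordering changes are ignored.
    grep -E '^(# )?CONFIG_' "$1" | sort > "$old_sorted"
    grep -E '^(# )?CONFIG_' "$2" | sort > "$new_sorted"

    # "<" lines exist only in the old config, ">" lines only in the new one.
    diff "$old_sorted" "$new_sorted" | grep -E '^[<>]'
    status=$?

    rm -f "$old_sorted" "$new_sorted"
    return $status
}

# Example call (NEW_VERSION is a placeholder you would have to fill in):
# kernel_config_diff /boot/config-2.6.18-92.1.10.el5xen "/boot/config-$NEW_VERSION"
```

Anything that shows up only on the ">" side is a candidate for what the
newer kernel turned on. It won't prove anything by itself, but it narrows
down what to look at once you do have a test box.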
-Ross