Epilogue:
I tried disabling TSO (ethtool -K eth0 tso off), as was suggested on the poweredge list. This did not help.
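For anyone repeating this test, it may be worth confirming that the setting really took effect, since a change made with ethtool -K is lost at the next reboot unless it is reapplied. The following is only a sketch: the helper name tso_state is made up for illustration, eth0 is assumed as above, and because the label printed by ethtool -k differs between ethtool versions, the pattern accepts both spellings.

```shell
# Hypothetical helper: read `ethtool -k` output on stdin and print the
# TSO state ("on" or "off").
# Older ethtool prints "tcp segmentation offload: off",
# newer ethtool prints "tcp-segmentation-offload: off".
tso_state() {
    sed -n 's/^[[:space:]]*tcp[ -]segmentation[ -]offload:[[:space:]]*\([a-z]*\).*/\1/p' | head -n 1
}

# Usage on the real machine (-K needs root):
#   ethtool -K eth0 tso off
#   ethtool -k eth0 | tso_state
```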
I then changed the default entry in /boot/grub/grub.conf so that the machine boots the 5.2 kernel. It has now been running for 6 1/2 days. I would say that this helped, and for now it is what I would suggest to others experiencing the same problem.
The currently running kernel is thus 2.6.18-92.1.10.el5xen.
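In GRUB legacy, the default= line in grub.conf selects a title stanza by its zero-based position, so it helps to list the stanzas with their indexes before editing. A minimal sketch, assuming the stock grub.conf layout; the helper name list_grub_titles is hypothetical:

```shell
# Hypothetical helper: print every GRUB legacy "title" line from a grub.conf
# read on stdin, prefixed with the zero-based index that "default=" refers to.
list_grub_titles() {
    awk '/^[ \t]*title[ \t]/ {
        sub(/^[ \t]*title[ \t]+/, "")   # strip the "title" keyword
        printf "%d: %s\n", n++, $0      # n starts at 0
    }'
}

# Usage on the real machine:
#   list_grub_titles < /boot/grub/grub.conf
#   grep '^default' /boot/grub/grub.conf
#   uname -r    # after the reboot; here it should print 2.6.18-92.1.10.el5xen
```

Set default= to the index printed next to the 2.6.18-92.1.10.el5xen entry, reboot, and check the result with uname -r.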
Regards and thanks for all replies,
Peter
Peter Hopfgartner wrote:
Dear ML
We upgraded a Dell PowerEdge PE 1950 server on the 8th of May. Since then the server has rebooted 3 times without external cause (it is located in a server farm with redundant power supply etc.). Looking at the server's monitoring data with Dell's own OpenManage tools, I see strange errors:
[root@servernew ~]# omreport system esmlog
(....)
Severity      : Critical
Date and Time : Mon May 11 17:46:59 2009
Description   : System Software event: run-time critical stop was asserted

Severity      : Critical
Date and Time : Fri May 15 21:07:57 2009
Description   : System Software event: run-time critical stop was asserted

Severity      : Critical
Date and Time : Wed May 20 21:00:53 2009
Description   : System Software event: run-time critical stop was asserted
(...)
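To keep track of whether new events of this kind appear, the records can be counted straight from the omreport output. This is only a sketch; the helper name count_critical_stops is made up, and it simply counts lines that contain the description text shown above.

```shell
# Hypothetical helper: count "critical stop" records in ESM log text on stdin.
# Each record carries the description text exactly once.
count_critical_stops() {
    grep -c 'run-time critical stop was asserted'
}

# Usage on the real machine:
#   omreport system esmlog | count_critical_stops
#   last reboot | head    # boot records from wtmp, to compare against the timestamps
```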
This class of error never occurred in the more than a year that the server has been running.
There is no mention of any anomaly in /var/log/messages, apart from the boot messages themselves.
The server runs the 64-bit flavor of CentOS, hosting some Xen virtual machines and some PostgreSQL and MySQL databases. It ran without any issues with CentOS 5.1 and 5.2.
I interpret these events as a kernel- or software-related problem, but I do not know how to diagnose the problem more precisely.
Can anybody give me a hint? Has anybody had a similar issue?
Regards,
Peter