Our very large and very complex application is running on CentOS 3.5 and Solaris. The CentOS machines are hanging about once per day. By "hanging" I mean the system is completely unresponsive either from the console or remotely via ssh. Totally locked.
Does anyone have any tips for troubleshooting this sort of problem?
-Mark
On Wed, 2006-05-31 at 17:42 -0400, Mark Belanger wrote:
> Our very large and very complex application is running on CentOS 3.5 and Solaris. The CentOS machines are hanging about once per day. By "hanging" I mean the system is completely unresponsive either from the console or remotely via ssh. Totally locked.
If you have a large amount of swap space configured, the application may be consuming so much memory that the machine thrashes: swap activity never catches up with what the system needs, and the box looks hung.
> Does anyone have any tips for troubleshooting this sort of problem?
Leave 'top' running on the console so you can see what was going on when the machine stopped. Also, install sysstat if you haven't already and run 'sar -A' to see the saved periodic snapshots of system resource use. You might also enable SNMP and run something like cacti to monitor and graph CPU and memory use, to see if you can find a pattern in the activity (such as a memory leak in your application).
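
If sar's sampling interval turns out to be too coarse to catch the last moments before a hang, a small logger can fill the gap. The following is only a rough sketch, assuming a Python 3 interpreter is available on the machine and that /proc/meminfo exposes MemFree, SwapTotal and SwapFree. The function names, the log path and the 10-second interval are made up for illustration. Each sample is forced to disk so the final line should still be there after a hard reset.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:   value kB' lines into {key: int}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip lines that are not "Key: number ..." (e.g. old-style table headers).
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_used_kb(info):
    """Swap in use, in kB; 0 if the swap fields are missing."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def log_forever(path="/var/tmp/memwatch.log", interval=10):
    """Append one sample every `interval` seconds (path and interval are assumptions)."""
    with open(path, "a") as log:
        while True:
            with open("/proc/meminfo") as f:
                info = parse_meminfo(f.read())
            log.write("%s MemFree=%d kB SwapUsed=%d kB\n" % (
                time.strftime("%Y-%m-%d %H:%M:%S"),
                info.get("MemFree", 0),
                swap_used_kb(info),
            ))
            # flush + fsync so the last sample survives a hard lockup
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval)
```

Call log_forever() from a script started in the background, then read the tail of the log after the next hang. If SwapUsed climbs steadily while MemFree stays near zero, that would point toward the swap or memory-leak explanation above.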