On 1/18/06, Les Mikesell lesmikesell@gmail.com wrote:
On Wed, 2006-01-18 at 13:38, Fong Vang wrote:
I have a total of 20 CentOS 4.1 systems running on fairly new hardware. About 6 of them are experiencing strange hangs without any indication -- nothing in /var/log/messages or on the console -- sometime within 10-30 minutes after a reboot. The systems still respond to ping, but you can't ssh to them. At the console, you can type "root" at the login prompt, but it hangs immediately after you hit enter.
Memory scans of all systems show no errors.
Any idea how to troubleshoot this problem? The system is not responsive enough to do any troubleshooting, and nothing abnormal is in the logs.
We're running this kernel: kernel-smp-2.6.9-11.EL.i686.rpm.
My first guess would be that something is consuming all available memory and pushing everything else into swap. The system may not be completely hung, but it can't respond in a reasonable amount of time. If the logs for whatever services you run don't show anything, I'd watch with top over a period of time to see if a single program is doing it, and run "ps ax" frequently to see if a large number of small processes are accumulating. You can get a hint about how fast new processes are being started by looking at the process id of the ps process itself when you run it repeatedly. I assume from the fact that you have 20 boxes that you are doing something that causes substantial load - perhaps it needs to be distributed better.
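The PID-watching idea above can be sketched as a small Python script. This is only an illustration under stated assumptions, not a tested diagnostic for this particular hang: the function names (pids_consumed, spawn_rate, sample_once, monitor) are made up for the example, PID_MAX assumes the usual Linux default of 32768, and the wraparound handling is approximate. Each sample spawns one short-lived child, so the sampler itself accounts for one PID per interval.

```python
import os
import subprocess
import time

# Assumed default; the real limit is in /proc/sys/kernel/pid_max.
PID_MAX = 32768


def pids_consumed(prev_pid, cur_pid, pid_max=PID_MAX):
    """PIDs allocated between two samples, allowing for one wraparound.

    Approximate: the kernel skips low reserved PIDs when it wraps.
    """
    delta = cur_pid - prev_pid
    return delta if delta >= 0 else delta + pid_max


def spawn_rate(samples, pid_max=PID_MAX):
    """Average PIDs allocated per second over (timestamp, pid) samples."""
    if len(samples) < 2:
        return 0.0
    total = sum(
        pids_consumed(a[1], b[1], pid_max)
        for a, b in zip(samples, samples[1:])
    )
    elapsed = samples[-1][0] - samples[0][0]
    return total / elapsed if elapsed > 0 else 0.0


def sample_once():
    """Spawn a trivial child; return (time, child pid, process count)."""
    child = subprocess.Popen(["true"])
    child.wait()
    # Every numeric directory under /proc is a live process.
    nprocs = sum(1 for name in os.listdir("/proc") if name.isdigit())
    return time.time(), child.pid, nprocs


def monitor(count=120, interval=5.0, window=12):
    """Print one line per sample; flush so the last lines survive a hang."""
    history = []
    for _ in range(count):
        now, pid, nprocs = sample_once()
        history = (history + [(now, pid)])[-window:]
        stamp = time.strftime("%H:%M:%S", time.localtime(now))
        rate = spawn_rate(history)
        print(f"{stamp} procs={nprocs} pid={pid} rate={rate:.1f}/s",
              flush=True)
        time.sleep(interval)
```

Calling monitor() right after a reboot, with output redirected to a file on local disk, would leave a record of the process count and PID growth leading up to the hang. A steadily climbing procs value or a high rate would point at something forking repeatedly; flat numbers would argue against that theory.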
These systems will be doing a lot once we turn on the service, but we're still in the setup mode.
So far, the only thing we've done is kicked these systems from the same image/profile. We've turned off all services, with almost nothing running on them at all. That's what's baffling about this. The hang is so silent that it is very difficult to troubleshoot (again, the system responds to ping; load average is normal, context switching is normal, swap is normal, network and I/O are normal).
We'll have to look at the hardware next to determine whether the systems are indeed identical.
-- Les Mikesell lesmikesell@gmail.com
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos