We are migrating HP servers from RHAS3 to CentOS 4.8. Since then, the load average has been reaching really high values. Under usage we see loads of 15-20 instead of the 0.5-1.5 we were used to.
I searched a lot for information on this issue and tried iostat, vmstat, top, and ps, but I don't get any significant hint.
CPU usage remains low, I/O wait remains low, no disks are lagging, and there is no swap usage.
Is there some tool or other way to diagnose why the load average is high, such as which processes are waiting and where they are stuck? Are there calls to drivers or system processes that are slow?
Thanks
On Thu, Jan 14, 2010 at 2:12 PM, fortin.pierre@bell.ca wrote:
> We are migrating HP servers from RHAS3 to CentOS 4.8. Since then, the load average has been reaching really high values. Under usage we see loads of 15-20 instead of the 0.5-1.5 we were used to.
> I searched a lot for information on this issue and tried iostat, vmstat, top, and ps, but I don't get any significant hint.
> CPU usage remains low, I/O wait remains low, no disks are lagging, and there is no swap usage.
> Is there some tool or other way to diagnose why the load average is high, such as which processes are waiting and where they are stuck? Are there calls to drivers or system processes that are slow?
Are you by any chance now running a CPU throttling program and weren't before?
> Are you by any chance now running a CPU throttling program and weren't before?
ACPI is not enabled. Other than that I couldn't tell. Sorry if it sounds noob. What can I check to be sure nothing is throttling the CPU?
__________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
On Thu, Jan 14, 2010 at 3:02 PM, fortin.pierre@bell.ca wrote:
>> Are you by any chance now running a CPU throttling program and weren't before?
> ACPI is not enabled. Other than that I couldn't tell. Sorry if it sounds noob. What can I check to be sure nothing is throttling the CPU?
Check if you're running cpuspeed, and perhaps cat /proc/cpuinfo and look at the "cpu MHz" entry. If it doesn't match the advertised speed of your CPU, then you likely have some sort of throttling enabled.
Here's the best article I've found on the CPU speed and top output: http://www.linuxjournal.com/article/9001
In short, if you're throttling your CPU (this generally being a *good* thing, since it minimizes power usage but still provides the same response) then the load average may seem to be higher. But the real metric to watch is how many processes are waiting rather than the load stat from top.
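As a rough sketch of the check described above, the following prints the current clock next to the model string for each CPU so the two can be compared by eye. The function name cpu_clock_report is made up for illustration, and it assumes x86-style "model name" and "cpu MHz" fields in /proc/cpuinfo; not every model string includes the advertised speed.

```shell
#!/bin/sh
# cpu_clock_report is a hypothetical helper name, not a standard tool.
# It reads a cpuinfo-style file (default: /proc/cpuinfo, "-" for stdin)
# and prints the current clock beside the model string of each CPU.
# Assumption: x86-style "model name" and "cpu MHz" fields are present.
cpu_clock_report() {
    awk -F': *' '
        /^model name/ { model = $2 }
        /^cpu MHz/    { printf "current=%s MHz  model=%s\n", $2, model }
    ' "${1:-/proc/cpuinfo}"
}
```

Calling cpu_clock_report with no argument reads /proc/cpuinfo. If the current value is well below the speed named in the model string, some form of frequency scaling is probably active; "pgrep -l cpuspeed" should show whether the cpuspeed daemon is running.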
On 1/14/2010 1:12 PM, fortin.pierre@bell.ca wrote:
> We are migrating HP servers from RHAS3 to CentOS 4.8. Since then, the load average has been reaching really high values. Under usage we see loads of 15-20 instead of the 0.5-1.5 we were used to.
> I searched a lot for information on this issue and tried iostat, vmstat, top, and ps, but I don't get any significant hint.
> CPU usage remains low, I/O wait remains low, no disks are lagging, and there is no swap usage.
> Is there some tool or other way to diagnose why the load average is high, such as which processes are waiting and where they are stuck? Are there calls to drivers or system processes that are slow?
I don't have an answer but out of curiosity, why would you move to a 4.x instead of 5.x now?
You might be able to do some ps snapshots to see the processes in R state, which is what the load average should be counting (on Linux it also counts processes in uninterruptible D state). That might be computed differently between the 2.4 and 2.6 kernels.
> I don't have an answer but out of curiosity, why would you move to a 4.x instead of 5.x now?
> You might be able to do some ps snapshots to see the processes in R state, which is what the load average should be counting (on Linux it also counts processes in uninterruptible D state). That might be computed differently between the 2.4 and 2.6 kernels.
The purpose of the server is to run NMS Telephony cards. The only support is for CentOS 4.x on 32-bit systems. Anyway, since I have not found the cause of the trouble, it may still be there with another CentOS version.
When I run ps several times during a high load average period (above 10), here is the kind of output I get:
ps aux | grep " R"
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root     16295  0.0  0.0  3492  772 pts/1    R+   15:45   0:00 ps -aux
root     16296  0.0  0.0  5400  648 pts/1    S+   15:45   0:00 grep R
I see a lot of processes in S, Sl, Ss+ and Ssl states.
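Since the Linux load average counts processes in uninterruptible sleep (D) as well as runnable ones (R), a snapshot that matches on the STAT column alone, and also prints the wait channel, may show more than grepping the whole line. This is only a sketch: the function names are invented here, and it assumes a procps ps that accepts "-eo stat,pid,wchan,args".

```shell
#!/bin/sh
# Hypothetical helper names; assumes procps-style ps output with STAT first.

# Keep the header line plus any process whose state starts with
# R (runnable) or D (uninterruptible sleep).
filter_load_states() {
    awk 'NR == 1 || $1 ~ /^[RD]/'
}

# Take N snapshots one second apart (N defaults to 5). The WCHAN column
# names the kernel function a sleeping process is waiting in, if known.
snapshot_load_states() {
    n=${1:-5}
    while [ "$n" -gt 0 ]; do
        date
        ps -eo stat,pid,wchan,args | filter_load_states
        n=$((n - 1))
        sleep 1
    done
}
```

Running "snapshot_load_states 10" while the load is high gives ten samples. Processes that keep showing up in D state with the same WCHAN value would be a hint about where they are stuck.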
On Thu, 2010-01-14 at 15:47 -0500, fortin.pierre@bell.ca wrote:
> We are migrating HP servers from RHAS3 to CentOS 4.8. Since then, the load average has been reaching really high values. Under usage we see loads of 15-20 instead of the 0.5-1.5 we were used to.
> I searched a lot for information on this issue and tried iostat, vmstat, top, and ps, but I don't get any significant hint.
--- Have you bothered to look at "lsof"? I find it really useful...
John
----- Original Message ----
From: "fortin.pierre@bell.ca" fortin.pierre@bell.ca
To: centos@centos.org
Sent: Thu, January 14, 2010 5:47:43 PM
Subject: Re: [CentOS] High load since passing from rhas3 to centos4.8
The purpose of the server is to run NMS Telephony cards. The only support is for CentOS 4.x on 32-bit systems. Anyway, since I have not found the cause of the trouble, it may still be there with another CentOS version.
When I run ps several times during a high load average period (above 10), here is the kind of output I get:
ps aux | grep " R"
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root     16295  0.0  0.0  3492  772 pts/1    R+   15:45   0:00 ps -aux
root     16296  0.0  0.0  5400  648 pts/1    S+   15:45   0:00 grep R
I see a lot of processes in S, Sl, Ss+ and Ssl states.
First of all, if you have a high load average with seemingly low utilization, it may be because of load imbalance between CPUs and/or short bursts of many short CPU-intensive processes.
Here's what I'd look at first:
Run "vmstat 1 10" and look at the first column (r, the run queue). If it is consistently higher than the number of CPUs, you have CPU saturation.
Run "mpstat -P ALL 1 10"; this gives you CPU utilisation per CPU (by default, sar gives you the average across all processors).
Also take a look at "sar -q 1 10" to see the run queue sizes.
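The vmstat check can be scripted roughly like this. It is a sketch only: count_cpus and flag_saturation are made-up names, and it assumes the usual vmstat layout in which the first column of each data line is r.

```shell
#!/bin/sh
# Hypothetical helper names; assumes standard vmstat and /proc/cpuinfo layouts.

# Count logical CPUs: one "processor" line per CPU in /proc/cpuinfo.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Read vmstat output on stdin and report every sample whose r column
# (first field) is above the CPU count given as the first argument.
# Header lines are skipped because their first field is not a number.
flag_saturation() {
    awk -v ncpu="$1" '
        $1 ~ /^[0-9]+$/ {
            samples++
            if ($1 > ncpu) { over++; printf "r=%d > ncpu=%d\n", $1, ncpu }
        }
        END { printf "%d of %d samples above ncpu\n", over, samples }
    '
}
```

For example: vmstat 1 10 | flag_saturation "$(count_cpus)". If most samples are flagged, the run queue really is longer than the CPUs can serve. If none are, the load is probably coming from something other than runnable processes.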
Hope this helps
Fer