[CentOS] High load average, low CPU utilization
Matt Garman
matthew.garman at gmail.com
Fri Mar 28 15:04:46 UTC 2014
On Fri, Mar 28, 2014 at 9:37 AM, John R. Dennison <jrd at gerdesas.com> wrote:
>
> On Fri, Mar 28, 2014 at 09:30:17AM -0500, Matt Garman wrote:
> >
> > How can the loadavg shoot up (from ~1 to ~20) without a corresponding
> > uptick in number of tasks?
>
> loadavg is based on number of processes vying for cpu time on the runq; the
> number of over-all processes on the system is not really relevant unless
> they are all competing for cpu.

Is there a way to see the number of processes on the runq, either from
the shell or programmatically?

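
For reference, here is a rough sketch of one way this could be checked on Linux, assuming /proc is mounted. The fourth field of /proc/loadavg is "runnable/total" scheduling entities, and the state letter in /proc/<pid>/stat shows whether a process is runnable (R) or in uninterruptible sleep (D). Linux counts both toward the load average. The function names (parse_loadavg, parse_state, count_states) are made up for illustration, and count_states only sees one entry per process, not per thread, so it can undercount.

```python
import glob
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    # Fourth field looks like "1/205": runnable entities / total entities.
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
    }


def parse_state(stat_text):
    """Return the state letter from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')' characters, so split on the LAST ')' before reading the state.
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states():
    """Count processes by state letter (R, S, D, ...) via /proc."""
    counts = {}
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                state = parse_state(f.read())
        except (OSError, IndexError):
            continue  # the process exited while we were scanning
        counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as f:
        print(parse_loadavg(f.read()))
    counts = count_states()
    print("R:", counts.get("R", 0), "D:", counts.get("D", 0))
```

Running this in a loop during a spike would show whether the extra load comes from tasks in R or in D, which the "Tasks:" line of top does not break out.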
> What's the i/o wait on the box when you see load spikes? If the box is
> i/o bound (indicated by high i/o) the load average will spike due to
> processes blocked on i/o cycles.
I ran "top -b" directed to a file and captured one of these spikes.
Here's a sample from the approximate start, peak, and end of the load
spike (respectively):

top - 18:40:29 up 14 days, 1:34, 4 users, load average: 0.80, 0.48, 0.29
Tasks: 205 total, 1 running, 204 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.2%us, 4.9%sy, 0.0%ni, 92.1%id, 0.0%wa, 0.1%hi, 1.7%si, 0.0%st

top - 19:16:00 up 14 days, 2:09, 4 users, load average: 19.67, 19.02, 15.75
Tasks: 203 total, 1 running, 202 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.1%us, 4.6%sy, 0.0%ni, 92.3%id, 0.0%wa, 0.2%hi, 1.9%si, 0.0%st

top - 20:20:27 up 14 days, 3:14, 4 users, load average: 0.93, 3.58, 8.69
Tasks: 212 total, 1 running, 211 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.2%us, 4.8%sy, 0.0%ni, 91.7%id, 0.6%wa, 0.1%hi, 1.6%si, 0.0%st

Looks like I collected 17277 top samples in total. The max "%wa" over
this time was 61.1%, and fewer than 40 of those samples had "%wa" over
10.0%. In other words, over many hours, the system had I/O wait over 10%
for less than a minute. And note that my load spike lasted for almost
two hours.
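
The tally above can be reproduced with something like the following sketch. It assumes the "Cpu(s):" line format shown in the samples in this message (newer versions of top print the fields differently). The names iowait_values and summarize are made up for illustration, and interval_s is a placeholder for whatever delay top was run with, which has to be supplied to turn a sample count into a duration.

```python
import re

# Matches the I/O wait field as printed in the samples, e.g. "0.6%wa".
WA_RE = re.compile(r"(\d+(?:\.\d+)?)%wa")


def iowait_values(lines):
    """Extract the %wa value from every 'Cpu(s):' line of `top -b` output."""
    values = []
    for line in lines:
        if line.startswith("Cpu(s):"):
            match = WA_RE.search(line)
            if match:
                values.append(float(match.group(1)))
    return values


def summarize(values, threshold=10.0, interval_s=1.0):
    """Summarize I/O wait samples; interval_s is an assumed sampling delay."""
    over = [v for v in values if v > threshold]
    return {
        "samples": len(values),
        "max_wa": max(values) if values else None,
        "over_threshold": len(over),
        # Approximate time spent above the threshold.
        "seconds_over": len(over) * interval_s,
    }


# The three Cpu(s) lines quoted in this message, used as sample input.
SAMPLE = """\
Cpu(s): 1.2%us, 4.9%sy, 0.0%ni, 92.1%id, 0.0%wa, 0.1%hi, 1.7%si, 0.0%st
Cpu(s): 1.1%us, 4.6%sy, 0.0%ni, 92.3%id, 0.0%wa, 0.2%hi, 1.9%si, 0.0%st
Cpu(s): 1.2%us, 4.8%sy, 0.0%ni, 91.7%id, 0.6%wa, 0.1%hi, 1.6%si, 0.0%st
"""

if __name__ == "__main__":
    print(summarize(iowait_values(SAMPLE.splitlines())))
```

To run it against a full capture, pass the lines of the saved top output file to iowait_values instead of SAMPLE.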