[CentOS] CPU usage over estimated?

nate centos at linuxpowered.net
Fri Jun 5 00:16:02 UTC 2009


Les Mikesell wrote:
> Scott Silva wrote:
>> on 6-4-2009 2:14 PM Les Mikesell spake the following:
>>> Scott Silva wrote:
>>>> on 6-4-2009 5:37 AM Theo Band spake the following:
>>>>> I have a quad core CPU running Centos5.
>>>>>
>>>>> When I use top, I see that running processes use 245% instead of 100%.
>>>>> If I use gkrellm, I just see one core being used 100%.
>>>>>
>>>> This one is easy. 4 cpu's, 100% total each, a maximum of 400%.
>>>>
>>>> Since one core is at 100%, the other 145% is spread across the other 3
>>>> cores.
>>> Is there any reasonable way to figure out the available CPU capacity
>>> from an SNMP monitoring tool?  (You want to know if the reported >100%
>>> usage is a problem but you don't know anything else about the machine).
>>>
>> That can be difficult, because a machine in I/O wait can be slower than a
>> machine at full CPU utilization. There is nothing technically wrong with a
>> machine at 100% cpu. It just means that the cpu is busy doing useful tasks,
>> instead of sitting idle doing nothing.
>> Where it is more critical is in a system that has occasional peaks of load.
>> If the system is already busy, then these tasks will wait. Unless your
>> system idles down and lowers the cpu freq. to save power, it isn't really
>> saving anything by being idle. As long as the system gets its work done in
>> a timely manner, then it isn't overloaded.
>
> SNMP does a reasonable job of reporting user/system/iowait.  That's not
> so much the question as how to know how many CPU's some machine has so
> you can know whether 400% is all of your capacity. That is, how many
> CPUs it has, since it doesn't scale the percentage against the total for
> you.
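
To make the scaling concrete: top sums the per-core percentages, so the
figure has to be divided by the number of CPUs to get a share of total
capacity. A minimal Python sketch follows. The function name
share_of_capacity is made up for illustration, and os.cpu_count() only
reports the CPUs of the machine the script runs on, so a remote monitor
would still need to get the count from the host some other way.

```python
import os


def share_of_capacity(summed_percent, cpu_count):
    """Scale a top-style summed percentage (0..100*N) to 0..100 of the box."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return summed_percent / cpu_count


if __name__ == "__main__":
    # The case from this thread: 245% reported on a quad core machine.
    print(share_of_capacity(245.0, 4))  # 61.25

    # Same figure scaled against the local machine's CPU count.
    # os.cpu_count() can return None, so fall back to 1 in that case.
    local_cpus = os.cpu_count() or 1
    print(share_of_capacity(245.0, local_cpus))
```

So 245% on four cores is a bit over 61% of the machine, not an overload.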

The built-in CPU usage reporting that SNMP provides on linux is worse
than worthless, as it returns incorrect data in many cases. From
the net-snmp FAQ:

What about multi-processor systems?
----------------------------------

    Sorry - the CPU statistics (both original percentages, and the
  newer raw statistics) both refer to the system as a whole.  There
  is currently no way to access individual statistics for a particular
  processor (except on Solaris systems - see below).

    Note that although the Host Resources table includes a hrProcessorTable,
  the current implementation suffers from two major flaws.  Firstly, it
  doesn't currently recognise the presence of multiple processors, and
  simply assumes that all systems have precisely one CPU.  Secondly, it
  doesn't calculate the hrProcessorLoad value correctly, and either returns
  a dummy value (based on the load average) or nothing at all.

    As of net-snmp version 5.1, the Solaris operating system delivers some
  information about multiple CPU's such as speed and type.

    Other than that, to monitor a multi-processor system, you're currently
  out of luck.  We hope to address this in a future release of the agent.


---

I wrote a few scripts that get CPU usage and feed it into SNMP for
retrieval for my cacti systems.

My company used to rely on the built-in linux SNMP stuff for cpu
usage (before I was hired), and they complained that it always seemed
to max out at 50% (on a dual cpu system).

I've been using my own methods of CPU usage extraction using sar
for about 6 years now and it works great. The only downside is that
sar keeps being rewritten, and with every revision they make it harder
and harder to parse (RHEL 3 was the easiest by far).
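
For anyone who would rather not parse sar output at all, the same numbers
can be computed from two snapshots of /proc/stat. This is not my sar
script, just a minimal Python sketch of the idea. The function names are
made up for illustration, and it assumes the usual field order (user nice
system idle iowait irq softirq steal) and counts iowait as idle time.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_label: (busy_ticks, total_ticks)} from /proc/stat text."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("cpu"):
            continue
        # Keep the first 8 counters only; guest time (if present) is
        # already included in user/nice and would be counted twice.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        counters[fields[0]] = (total - idle, total)
    return counters


def cpu_usage(before, after):
    """Percent busy per label between two parse_proc_stat() results."""
    usage = {}
    for label, (busy1, total1) in after.items():
        if label not in before:
            continue
        busy0, total0 = before[label]
        elapsed = total1 - total0
        usage[label] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return usage


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return the usage."""
    with open(path) as f:
        first = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_proc_stat(f.read())
    return cpu_usage(first, second)


if __name__ == "__main__":
    for label, percent in sorted(sample().items()):
        print("%s %.1f" % (label, percent))
```

The "cpu" line is the whole machine already scaled to 0-100%, and the
"cpu0", "cpu1", ... lines are the individual processors, so nothing has
to be guessed about how many CPUs the box has.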

Sample graph -
http://portal.aphroland.org/~aphro/cacti-cpu.png

That particular cacti server is collecting roughly 20 million data
points daily (14,500/minute). It is *heavily* customized for higher
scalability.

nate



