[CentOS] Re: [OT] What is the best network monitoring tool?

Mon Oct 13 23:45:02 UTC 2008
nate <centos at linuxpowered.net>

Les Mikesell wrote:
> nate wrote:
>
>>
>> Last I checked as well the SNMP daemon didn't return cpu i/o
>> wait values, which is pretty handy to have.
>
> It must... I haven't waded through the details of how it does it, but a
> default OpenNMS install will collect and graph a CPU usage chart that
> stacks user/nice/wait/system/interrupts and seems accurate except that
> it is per-cpu (i.e. will go to 400% on a hyperthreaded dual-cpu box).

Strange, since the FAQ for snmpd specifically says per-CPU
stats are not accurate:

What about multi-processor systems?
----------------------------------

    Sorry - the CPU statistics (both original percentages, and the
  newer raw statistics) both refer to the system as a whole.  There
  is currently no way to access individual statistics for a particular
  processor (except on Solaris systems - see below).

    Note that although the Host Resources table includes a hrProcessorTable,
  the current implementation suffers from two major flaws.  Firstly, it
  doesn't currently recognise the presence of multiple processors, and
  simply assumes that all systems have precisely one CPU.  Secondly, it
  doesn't calculate the hrProcessorLoad value correctly, and either returns
  a dummy value (based on the load average) or nothing at all.

    As of net-snmp version 5.1, the Solaris operating system delivers some
  information about multiple CPU's such as speed and type.

    Other than that, to monitor a multi-processor system, you're currently
  out of luck.  We hope to address this in a future release of the agent.
  But you've got the source, so you can always have a go yourself :-)

---

I'm not aware of any other tool (e.g. sar, vmstat, etc.) that
reports stats on a per-CPU basis on a 2.6.x kernel, though per-CPU
stats were available on older 2.4.x kernels, with sar at least.
That said, I've only ever been interested in CPU usage as a whole
rather than per-CPU stats. My main cacti server has 2555 graphs
as it is.
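
For reference, the raw counters behind these numbers live in
/proc/stat on Linux: one aggregate "cpu" line plus "cpu0", "cpu1",
and so on. Below is a minimal Python sketch of turning two samples
into percentages, assuming the 2.6 field order documented in proc(5)
(user, nice, system, idle, iowait, irq, softirq). The names
parse_stat and usage_percent are made up for illustration, and this
is not claimed to be how snmpd, OpenNMS or cacti does it.

```python
# Assumed field order of the "cpu" lines in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_stat(text):
    """Return {label: {field: jiffies}} for each 'cpu' line in the text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue  # skip intr, ctxt, btime, ...
        values = [int(v) for v in parts[1:1 + len(FIELDS)]]
        # Pad with zeros if the kernel reports fewer fields than assumed.
        values += [0] * (len(FIELDS) - len(values))
        counters[parts[0]] = dict(zip(FIELDS, values))
    return counters


def usage_percent(before, after):
    """Share of time spent in each state between two samples of one line."""
    deltas = {f: after[f] - before[f] for f in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {f: 0.0 for f in FIELDS}
    return {f: 100.0 * d / total for f, d in deltas.items()}
```

On a real box you would read /proc/stat twice, a second or more
apart, pass each read through parse_stat, and compare the two "cpu"
entries. Because the aggregate line is normalised against its own
total, the result stays within 100% for the system as a whole, and
the iowait field is the i/o wait share.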

I have too many hundreds of hours invested in cacti right now
to make the jump to anything else at the moment. Perhaps some
day I will jump ship and use something else, or go back to
writing my own, which I used to do several years ago in order
to get higher-resolution monitoring (e.g. 10, 30, or 60 second
intervals). My cacti collects about 11 million data points a day
today, with room on the hardware to probably go to 25 million
before needing a second server (dual-processor quad-core, 16GB).
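
To put those daily totals in perspective, here is a rough
back-of-the-envelope conversion in Python. The 300-second polling
interval is an assumption (it is not stated above), and the helper
names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(points_per_day):
    """Average data points collected per second."""
    return points_per_day / SECONDS_PER_DAY


def sources_polled(points_per_day, interval_seconds):
    """Data sources needed to produce that many points at one interval."""
    polls_per_day = SECONDS_PER_DAY / interval_seconds
    return points_per_day / polls_per_day


for daily in (11000000, 25000000):
    # ASSUMPTION: every data source is polled once every 300 seconds.
    print(daily, round(per_second(daily), 1),
          round(sources_polled(daily, 300)))
```

Under that assumption, 11 million points a day averages out to
roughly 127 points per second.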

nate