Hi all,
Currently I have a big question.
What is the best OPEN SOURCE solution for monitoring multiple hosts and services, for example at a web hosting provider with 50 hosts or more?
I have been using NAGIOS for more than 3 years. It is an incredible tool, but before upgrading to version 3, I have this existential question.
Please feel free to recommend any software, but with these requirements in mind:
1- Easy customization of any plug-in
2- Good documentation
3- Support for large platforms
4- Runs on Linux
Sorry for my English.
Regards,
Alejandro
www.linuxiso.com.ar
Buenos Aires, Argentina
Alejandro wrote:
> What is the best OPEN SOURCE solution for monitoring multiple hosts and services, for example at a web hosting provider with 50 hosts or more?
What exactly are you interested in monitoring? Different tools have different uses. If you're used to Nagios and your environment isn't that big, perhaps you should take a peek at GroundWork -
http://www.groundworkopensource.com/community/community-edition.html
I use a combination of Nagios and Cacti: Nagios for event-based monitoring and Cacti for performance monitoring/trending. My Cacti is *heavily* customized, the result of hundreds of hours of work, and monitors roughly 11 million data points a day. Nagios monitors about half a million.
But which tool is best depends on what exactly you're monitoring, e.g. how complex the monitors need to be. Do you just need PING and basic HTTP checks, or are you looking into more complex application-level monitoring?
nate
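For the simple end of that range, a basic HTTP check can be sketched as a Nagios-style plugin. This is only an illustration, not anything posted in the thread: it assumes the usual plugin convention of exit codes 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN and one status line on stdout, and the function names and default thresholds are made up.

```python
import time
import urllib.request

# Exit codes used by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(elapsed, warn, crit):
    """Map a response time in seconds to an (exit_code, label) pair."""
    if elapsed >= crit:
        return CRITICAL, "CRITICAL"
    if elapsed >= warn:
        return WARNING, "WARNING"
    return OK, "OK"


def check_http(url, warn=1.0, crit=5.0):
    """Fetch url once and return (exit_code, one-line status message).

    The warn/crit defaults are hypothetical thresholds in seconds.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=crit) as resp:
            status = resp.status
    except Exception as exc:  # refused connection, timeout, HTTP error, ...
        return CRITICAL, "HTTP CRITICAL - %s" % exc
    elapsed = time.monotonic() - start
    code, label = classify(elapsed, warn, crit)
    return code, "HTTP %s - status %d in %.3fs" % (label, status, elapsed)
```

A wrapper script would print the message and exit with the returned code, which is what lets Nagios decide whether to alert.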
Nate,
Thanks for your response.
Currently I have a lot of services to monitor, for example:
Databases: Oracle - MySQL - MSSQL
WebServers: SunWebServer - SunJavaAppServer - Apache
Systems: Windows - Linux
Networking: Cisco Switch - LAN Interfaces
I'm currently looking at these projects: GroundWork and Centreon (http://www.centreon.com/). Both tools are based on Nagios, but add statistical tools.
Another excellent tool is JFFNMS (http://www.jffnms.org/)
Regards, Alejandro
2008/10/8 nate centos@linuxpowered.net
> What exactly are you interested in monitoring? Different tools have different uses.
> [...]
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Alejandro wrote:
> Another excellent tool is JFFNMS (http://www.jffnms.org/)
I installed JFFNMS - and I hated it right from the start - the interface was barely usable even with 1920x1200.
Check out www.zabbix.com. It needs an agent, but with the agent, a lot of stuff is autodetected (drives etc. - in Nagios, you've got to do a lot of manual work). I still like Nagios for its flexibility, though.
Rainer
Alejandro wrote:
> I'm currently looking at these projects: GroundWork and Centreon (http://www.centreon.com/). Both tools are based on Nagios, but add statistical tools.
> Another excellent tool is JFFNMS (http://www.jffnms.org/)
Don't forget OpenNMS - http://www.opennms.org.
Les Mikesell wrote:
> Don't forget OpenNMS - http://www.opennms.org.
Hi,
I also use Nagios for monitoring events, and I'm quite happy with it, but I'm looking for a complementary tool for performance/trend monitoring, since I'm not very happy with the Nagios Addons like NagiosGrapher, PNP and the like (limited functionality, high installation effort).
I've also taken a look at Cacti, but it seems not very mature yet and integration into Nagios is also rather tedious if you don't want two separate solutions doing things twice.
Recently I stumbled over Munin (http://sourceforge.net/projects/munin/), which looks interesting, but I haven't had the time to take a look at it yet.
Is anybody using Nagios and Munin together and wants to share some experience? :-)
best regards,
Thomas
Hi,
For this same reason I'm looking at Centreon. It is built on Nagios and has excellent integration of performance graphs, all in a single system.
Regards www.linuxiso.com.ar
2008/10/12, Thomas Bleier tbleier2@gmail.com:
> I also use Nagios for monitoring events, and I'm quite happy with it, but I'm looking for a complementary tool for performance/trend monitoring, since I'm not very happy with the Nagios Addons like NagiosGrapher, PNP and the like (limited functionality, high installation effort).
Has anybody tried Zenoss? Is it worth working with?
I'm looking for something that is agent-less and not Java (I want to keep it relatively lightweight). Anything else out there?
--
Best Regards,
Ivan Levchenko levchenko.i@gmail.com
> I'm looking for something that is agent-less and not Java (I want to keep it relatively lightweight). Anything else out there?
I haven't been keeping up with this thread, so I apologize for repeating any other suggestions, but have you looked at:
1) Cacti (http://www.cacti.net/)
2) Centreon (http://www.centreon.com/)
Cacti has plenty of 3rd party plugins if you want, or just use SNMP in its default form. And Centreon, like Nagios, doesn't require you to use agents if you don't want.
Regards, Kenneth Price
On Mon, Oct 13, 2008 at 8:12 PM, Kenneth Price kprice@nowyouknow.net wrote:
> I haven't been keeping up with this thread, so I apologize for repeating any other suggestions, but have you looked at:
> - Cacti (http://www.cacti.net/)
> - Centreon (http://www.centreon.com/)
I'm already using Cacti, but it's not very usable, especially the notifications (or lack of them), and either I didn't find a way, or Cacti cannot monitor services (http, smtp, others...).
--
Best Regards,
Ivan Levchenko levchenko.i@gmail.com
Ivan Levchenko wrote:
> I'm already using Cacti, but it's not very usable, especially the notifications (or lack of them), and either I didn't find a way, or Cacti cannot monitor services (http, smtp, others...).
Nagios is for alerts and notifications, Cacti is for graphing trends. While either can be forced into the other role, each is best at what it does natively.
On Mon, Oct 13, 2008 at 9:21 PM, John R Pierce pierce@hogranch.com wrote:
> Nagios is for alerts and notifications, Cacti is for graphing trends. While either can be forced into the other role, each is best at what it does natively.
True, but I'm not into installing agents... just lazy about it, probably. I know that SNMP supports everything that I need, and I just do not want to install something that requires an agent when it can be done without it.
Ivan Levchenko wrote:
> True, but I'm not into installing agents... just lazy about it, probably. I know that SNMP supports everything that I need, and I just do not want to install something that requires an agent when it can be done without it.
Well, SNMP isn't going to tell you much about the state of a webserver or database statistics or whatever; it's mostly just network information.
John R Pierce wrote:
> Well, SNMP isn't going to tell you much about the state of a webserver or database statistics or whatever; it's mostly just network information.
It can tell you just about anything depending on how you configure it; SNMP is just a protocol. I get DB stats, system stats etc. from snmpd by hooking in custom OIDs to external scripts/files.
nate
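As a rough illustration of hooking a script into snmpd (not nate's actual setup): Net-SNMP's snmpd.conf has an `extend NAME PROG ARGS` directive that runs a command and publishes its output under NET-SNMP-EXTEND-MIB. A line such as `extend sarstats /usr/bin/python3 /usr/local/bin/last_line.py /var/tmp/sar.usage` could call a small script like the one below. The name `sarstats`, the script name and both paths are hypothetical.

```python
def last_nonempty_line(lines):
    """Return the last non-blank line from an iterable of lines, or ''."""
    last = ""
    for line in lines:
        stripped = line.strip()
        if stripped:
            last = stripped
    return last


def main(argv):
    """Print the newest record of the file named in argv[1].

    snmpd runs the command and serves whatever it prints on stdout.
    """
    if len(argv) != 2:
        print("usage: last_line.py FILE")
        return 1
    with open(argv[1]) as handle:
        print(last_nonempty_line(handle))
    return 0
```

Because the monitoring server only ever sees an OID, the script behind it can be swapped out without touching the poller.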
nate wrote:
> It can tell you just about anything depending on how you configure it; SNMP is just a protocol. I get DB stats, system stats etc. from snmpd by hooking in custom OIDs to external scripts/files.
And that's different from configuring remote agents?
John R Pierce wrote:
> And that's different from configuring remote agents?
Remote agents for anything but the SNMP protocol tend to be specific to a particular monitoring tool and are often available for only one or a limited number of platforms. SNMP is built into most network-capable devices.
On Mon, Oct 13, 2008 at 11:15 PM, Les Mikesell lesmikesell@gmail.com wrote:
> Remote agents for anything but the SNMP protocol tend to be specific to a particular monitoring tool and are often available for only one or a limited number of platforms. SNMP is built into most network-capable devices.
Exactly the way I'm thinking about it. This gives you the option to switch between any monitoring tools that properly support SNMP, and I will not have to change anything on the client side. Plus, as far as I know, SNMP gives disk usage info and, afaik, also running processes. For now, that's enough for me.
Ivan Levchenko wrote:
> Exactly the way I'm thinking about it. This gives you the option to switch between any monitoring tools that properly support SNMP, and I will not have to change anything on the client side. Plus, as far as I know, SNMP gives disk usage info and, afaik, also running processes. For now, that's enough for me.
Just remember that, at least on Linux, the CPU usage information presented by snmpd is wildly inaccurate, so don't rely on it for that particular stat. I wrote my own little scripts to feed into snmpd to get CPU usage, and associated templates etc. for Cacti.
nate
nate wrote:
> Just remember that, at least on Linux, the CPU usage information presented by snmpd is wildly inaccurate, so don't rely on it for that particular stat. I wrote my own little scripts to feed into snmpd to get CPU usage, and associated templates etc. for Cacti.
Can you be more specific about how snmp is wrong and what you do to get a more accurate value? Is it just that the snmp value needs to be scaled by the number of processors?
Les Mikesell wrote:
> Can you be more specific about how snmp is wrong and what you do to get a more accurate value? Is it just that the snmp value needs to be scaled by the number of processors?
Seems like the SNMPD included in CentOS 5.x has improved somewhat vs v4.
From the FAQ
What do the CPU statistics mean - is this the load average?
----------------------------------------------------------
No. Unfortunately, the original definition of the various CPU statistics was a little vague. It referred to a "percentage", without specifying what period this should be calculated over. It was therefore implemented slightly differently on different architectures.
Recent releases includes "raw counters", which can be used to calculate the percentage usage over any desired period. This is the "right" way to handle things in the SNMP model. The original flawed percentage objects should not be used, and will be removed in a future release of the agent.
Note that this is different from the Unix load average, which is available via the loadTable, and is supported on all architectures.
---
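The "raw counters" approach the FAQ describes can be sketched roughly as follows: poll the counters twice and turn the differences into percentages over that interval. This is a minimal sketch, not code from the thread. It assumes the samples have already been fetched (for example from the ssCpuRaw* objects in UCD-SNMP-MIB) into plain dicts, the state names are made up, and counter wrap-around is not handled.

```python
def cpu_percentages(earlier, later):
    """Percentage of CPU time per state between two raw-counter samples.

    Each sample maps a state name (e.g. "user", "idle") to a tick counter.
    """
    deltas = {state: later[state] - earlier[state] for state in earlier}
    total = sum(deltas.values())
    if total <= 0:
        # No ticks elapsed between the samples (or the counters were reset).
        return {state: 0.0 for state in earlier}
    return {state: 100.0 * delta / total for state, delta in deltas.items()}


# Two hypothetical samples taken some time apart.
first = {"user": 100, "system": 50, "idle": 850}
second = {"user": 150, "system": 60, "idle": 1090}
usage = cpu_percentages(first, second)
```

Since the result is a share of all ticks counted in the interval, it describes the system as a whole no matter how many processors are present.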
Older versions would basically spit out random values for CPU usage. For about the past 5 years I have used scripts, run out of cron, that run sar, parse the output, and send the results to a file; snmpd is then configured to tail that file when a particular OID is queried. This has given me really dependable results over the years.
[root@us-cfe002:/home/monitor/stats]# tail -n 1 *
==> disk.usage <==
DISK_T:60707 DISK_U:9567

==> mem.usage <==
RAM_T:3950 RAM_F:2732 RAM_B:58 RAM_C:731 SWAP_T:8189 SWAP_U:0

==> sar.usage <==
USER:0.01 NICE:0.00 SYS:0.01 IO:0.00 FAULT:41.16 TCPSOCK:21
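On the consuming side, records in the NAME:value format shown above are easy to pick apart. The following is a small sketch under the assumption that every field is a numeric NAME:value pair separated by whitespace; `parse_stats` is a made-up helper, not part of any tool mentioned here.

```python
def parse_stats(line):
    """Turn 'USER:0.01 NICE:0.00' into {'USER': 0.01, 'NICE': 0.0}."""
    stats = {}
    for field in line.split():
        name, sep, value = field.partition(":")
        if not sep:
            continue  # skip anything that is not NAME:value
        stats[name] = float(value)
    return stats


sample = parse_stats("USER:0.01 NICE:0.00 SYS:0.01 IO:0.00 FAULT:41.16 TCPSOCK:21")
```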
Last I checked as well the SNMP daemon didn't return cpu i/o wait values, which is pretty handy to have.
Then I have a script that queries the data (along with other data) and feeds it into cacti as a single set of results (to be stored in 1 RRD file), which really helps cacti scale:
[cacti@dc1-mon002:~/bin]$ ./linux-basics-net.pl us-cfe002 public
USER:0.01 NICE:0.00 SYS:0.02 IO:0.00 FAULT:61.78 TCPSOCK:21 RAM_T:3950 RAM_F:2732 RAM_B:58 RAM_C:731 SWAP_T:8189 SWAP_U:0 DISK_T:60707 DISK_U:9567 1MIN:0.00 5MIN:0.00 15MIN:0.00 E0_IN:747203652 E0_OUT:520021358 E1_IN:0 E1_OUT:0
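The idea of returning everything as one result set can be sketched like this. It is not the Perl script above, just an illustration of the output shape: Cacti's script data input methods accept several fields on one line as space-separated name:value pairs, so separate sources only need to be joined. The helper name and the sample values are assumptions.

```python
def merge_line(*sources):
    """Join several {name: value} dicts into one 'NAME:value ...' line."""
    fields = []
    for source in sources:
        for name, value in source.items():
            fields.append("%s:%s" % (name, value))
    return " ".join(fields)


# Hypothetical per-source results, kept as strings so formatting is preserved.
cpu = {"USER": "0.01", "SYS": "0.02"}
mem = {"RAM_T": "3950", "RAM_F": "2732"}
line = merge_line(cpu, mem)
```

One poll per host that fills one RRD file keeps the number of poller invocations and files down, which is presumably where the scaling benefit comes from.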
Unfortunately, with every passing revision sar becomes more and more difficult to parse. I really miss the version from the RHEL 3 days. That one was great: it had a special human-readable output option, which has since been taken out (it would spit out each stat on one line, making it easy to parse).
nate
nate wrote:
> Last I checked as well the SNMP daemon didn't return cpu i/o wait values, which is pretty handy to have.
It must... I haven't waded through the details of how it does it, but a default OpenNMS install will collect and graph a CPU usage chart that stacks user/nice/wait/system/interrupts and seems accurate except that it is per-cpu (i.e. will go to 400% on a hyperthreaded dual-cpu box).
It also does a CPU statistics chart that does a line graph for the 1/5/15 minute values with the space under the line color-coded for %cpu utilization.
> Then I have a script that queries the data (along with other data) and feeds it into cacti as a single set of results (to be stored in 1 RRD file), which really helps cacti scale:
> [cacti@dc1-mon002:~/bin]$ ./linux-basics-net.pl us-cfe002 public
> USER:0.01 NICE:0.00 SYS:0.02 IO:0.00 FAULT:61.78 TCPSOCK:21 RAM_T:3950 RAM_F:2732 RAM_B:58 RAM_C:731 SWAP_T:8189 SWAP_U:0 DISK_T:60707 DISK_U:9567 1MIN:0.00 5MIN:0.00 15MIN:0.00 E0_IN:747203652 E0_OUT:520021358 E1_IN:0 E1_OUT:0
And it does a system memory stats graph with color-coded used/io_buff/shared/filesystem cache/available/swap/real values.
> Unfortunately, with every passing revision sar becomes more and more difficult to parse. I really miss the version from the RHEL 3 days. That one was great: it had a special human-readable output option, which has since been taken out (it would spit out each stat on one line, making it easy to parse).
You might want to look at OpenNMS if you haven't already. Now that they have a yum repo it is very easy to install.
Les Mikesell wrote:
> It must... I haven't waded through the details of how it does it, but a default OpenNMS install will collect and graph a CPU usage chart that stacks user/nice/wait/system/interrupts and seems accurate except that it is per-cpu (i.e. will go to 400% on a hyperthreaded dual-cpu box).
Strange since the FAQ for snmpd specifically says per-cpu stats are not accurate.
What about multi-processor systems?
----------------------------------
Sorry - the CPU statistics (both original percentages, and the newer raw statistics) both refer to the system as a whole. There is currently no way to access individual statistics for a particular processor (except on Solaris systems - see below).
Note that although the Host Resources table includes a hrProcessorTable, the current implementation suffers from two major flaws. Firstly, it doesn't currently recognise the presence of multiple processors, and simply assumes that all systems have precisely one CPU. Secondly, it doesn't calculate the hrProcessorLoad value correctly, and either returns a dummy value (based on the load average) or nothing at all.
As of net-snmp version 5.1, the Solaris operating system delivers some information about multiple CPU's such as speed and type.
Other than that, to monitor a multi-processor system, you're currently out of luck. We hope to address this in a future release of the agent. But you've got the source, so you can always have a go yourself :-)
---
I'm not aware of any other tool (e.g. sar, vmstat, etc.) that reports stats on a per-CPU basis on a 2.6.x kernel, though per-CPU stats were available on older 2.4.x kernels with sar at least. I've always only been interested in CPU usage as a whole rather than per-CPU stats. My main Cacti server has 2555 graphs as it is.
I have too many hundreds of hours invested in Cacti right now to make the jump to anything else at the moment... but perhaps some day I will jump ship and use something else, or go back to writing my own, which I used to do several years ago in order to get higher-resolution monitoring (e.g. 10, 30, 60 second intervals). My Cacti collects about 11 million data points a day today, with room on the hardware to probably go to 25 million before needing a 2nd server (dual proc quad core, 16GB).
nate
So has anybody had any good experience with Zenoss? I fired it up once for a couple of hours and tried to get the disk usage info using it, but failed. I didn't have that much time to get into it, so I just used Cacti instead... but I would love to get some more functionality out of it, like auto discovery (the plugin for Cacti doesn't do anything useful).
On Wed, Oct 8, 2008 at 7:01 PM, Alejandro cdgraff@gmail.com wrote:
> What is the best OPEN SOURCE solution for monitoring multiple hosts and services, for example at a web hosting provider with 50 hosts or more?
I've had good luck with "mon". Written in Perl and supports plug-in monitor and alert modules. The modules are easy to write.
It's better than most largely advertised system monitoring/alert packages because it is simple in design, configuration is flexible, and it supports a large number of hosts.
http://mon.wiki.kernel.org/index.php/Main_Page
Hi all,
I just discovered this link:
http://en.wikipedia.org/wiki/Comparison_of_network_monitoring_systems
It is an excellent reference on the better-known NMS software.
Regards, Alejandro www.linuxiso.com.ar
2008/10/15 "José Mejía , Ayto de l'Alcora" jmejia@alcora.org
Hello:
I'm using Hobbit:
http://hobbitmon.sourceforge.net/
http://www.hswn.dk/hobbit (live system)
"Hobbit monitors your *hosts*, your *network services*, and *anything else* you configure it to do via extensions. Hobbit can periodically generate requests to network services - http, ftp, smtp and so on - and record if the service is responding as expected. You can also monitor local disk utilisation, logfiles and processes through the use of agents installed on the servers."
It's not as complete as nagios but works fine
Greetings
Tim Berger wrote:
> I've had good luck with "mon". Written in Perl and supports plug-in monitor and alert modules. The modules are easy to write.
Alejandro wrote:
> http://en.wikipedia.org/wiki/Comparison_of_network_monitoring_systems
> It is an excellent reference on the better-known NMS software.
I don't think I'd trust that very much - I think all of the "no" blocks for OpenNMS are wrong.