Re: [CentOS] Monitoring services

16 Dec 2011


On Fri, Dec 16, 2011 at 12:02 PM, Alan McKay <alan.mckay@gmail.com> wrote:
...
...
...
Thoughts from anyone on any of this?

Network monitoring is not trivial no matter what tool you use.  Pick
something that you trust to scale to the proportions you will need so
you don't do a lot of work and then hit a wall.  And if you have a
lot of systems, avoid anything that needs per-system configuration or
agent installation.

Agreed.  I'm definitely not looking for trivial - just trying to make sure
I understand the strengths and weaknesses of each system to help me make
the right decision.  Because once I've made that decision, I have to live
with it :-)  Our environment is relatively small: about 80 servers that
are mostly grouped into 3 compute clusters for the scientists I support, a
few switches, and no routers under my direct control (though a few Linux
boxes route between NICs, since some of the environment is on our own
private LAN behind said Linux box, cut off from the Hospital's network).

You may not need 'direct' control of the routers - just read access
for snmp to monitor them.  And if the switches have snmp you can get
per-interface traffic which will obviously match whatever is on the
other end of the wire.  Does the cluster software have its own
close-coupled monitor like Ganglia?

One thing I haven't found in any
of the frameworks I've seen that everybody is likely to need is a good
concept of aggregates.  That is, you will have some level of
redundancy in fail-over sets and some level of group capacity in
load-balanced sets.  While you may want to be alerted about individual
failures, what you really need to track is how close you are to
capacity across the working group members - and nothing does that very
well.
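
To make the aggregate idea concrete, here is a minimal sketch in Python.
It is not taken from any monitoring framework; the Member fields, the
group_headroom name, and the capacity/load units are all assumptions
chosen for illustration. It reports how full a load-balanced set is
across its working members, and whether the set could still carry the
current load if its largest working member failed.

```python
from dataclasses import dataclass


@dataclass
class Member:
    # Hypothetical record for one host in a load-balanced set.
    name: str
    up: bool          # result of whatever health check you already run
    capacity: float   # most load this member can carry (e.g. requests/sec)
    load: float       # current load, in the same units as capacity


def group_headroom(members):
    """Summarise a group: current load versus capacity of working members."""
    working = [m for m in members if m.up]
    capacity = sum(m.capacity for m in working)
    load = sum(m.load for m in working)
    # With no working capacity, report the group as infinitely overloaded.
    utilization = load / capacity if capacity > 0 else float("inf")
    # N-1 check: would the load still fit if the largest working member died?
    largest = max((m.capacity for m in working), default=0.0)
    survives_one_failure = load <= capacity - largest
    return {
        "working": len(working),
        "total": len(members),
        "utilization": utilization,
        "survives_one_failure": survives_one_failure,
    }


# Example group: three web servers, one of them already down.
web = [
    Member("web1", True, 100.0, 60.0),
    Member("web2", True, 100.0, 55.0),
    Member("web3", False, 100.0, 0.0),
]
status = group_headroom(web)
```

In this example an individual alert would fire for web3, but the number
that matters is the group one: the two remaining members are a bit over
half full, and the set could not absorb one more failure.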
-- 
   Les Mikesell
    lesmikesell@gmail.com
