[CentOS] Somewhat OT: (Nagios)

Sergio Belkin wrote:
> 2008/5/13  <jleaver+centos at reachone.com>:
> 
> OK, you won :) I'm going to test  nagios. I am using centos 5.1
> x86_64. Do I lose much if I use rpm from rpmforge (version 2.9)?
> 

We're running version 2.11 at the office (on CentOS 5.1 x86_64).  I've 
looked at some of the things in 3.0, but there's nothing there that I 
need yet.

Hopefully you have some way to track changes in /etc/nagios (FSVS is 
what we use), because it will make your life much easier to have an 
audit trail.

We created sub-folders under /etc/nagios to hold the various types of 
entities.  For example, we have:

/etc/nagios/commands
/etc/nagios/contacts
/etc/nagios/contactgroups
/etc/nagios/hosts-switches
/etc/nagios/hosts-dmz
/etc/nagios/hosts-servers
/etc/nagios/hosts-lan
/etc/nagios/templates-hosts
/etc/nagios/templates-services
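
For Nagios to read those folders, each one has to be listed in 
nagios.cfg.  A minimal sketch, assuming the main config file lives at 
/etc/nagios/nagios.cfg and that you use the cfg_dir directive (which 
picks up every file ending in .cfg in that folder) instead of listing 
files one by one with cfg_file:

```
# /etc/nagios/nagios.cfg (excerpt; the location is an assumption)
# One cfg_dir line per sub-folder; every *.cfg file inside is loaded.
cfg_dir=/etc/nagios/commands
cfg_dir=/etc/nagios/contacts
cfg_dir=/etc/nagios/contactgroups
cfg_dir=/etc/nagios/hosts-switches
cfg_dir=/etc/nagios/hosts-dmz
cfg_dir=/etc/nagios/hosts-servers
cfg_dir=/etc/nagios/hosts-lan
cfg_dir=/etc/nagios/templates-hosts
cfg_dir=/etc/nagios/templates-services
```

Running "nagios -v /etc/nagios/nagios.cfg" after each change checks 
the whole configuration before you restart the daemon.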

We then broke individual elements out of the default massive 
configuration file into individual .cfg files.  For example, we chose 
to create a separate file for each contact rather than putting them all 
in a single file.  So far it works well: it's a lot easier to get a feel 
for what users have been defined, what hosts are defined, and what the 
templates are.  When I look in templates-services, I can see from 
the directory listing that I have service templates named X, Y and Z 
(without having to open up a file to look).
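
As an illustration, a one-contact-per-file layout could look roughly 
like this.  The file name, the contact and the e-mail address are made 
up, and the 24x7 time period and the two notify commands are assumed 
to be defined elsewhere (they are in the sample configuration that 
ships with Nagios 2.x, but check your own):

```
# /etc/nagios/contacts/jdoe.cfg   (hypothetical file and contact)
define contact{
	contact_name                    jdoe
	alias                           Jane Doe
	service_notification_period     24x7    ; assumed time period
	host_notification_period        24x7
	service_notification_options    w,u,c,r
	host_notification_options       d,u,r
	service_notification_commands   notify-by-email       ; assumed command
	host_notification_commands      host-notify-by-email  ; assumed command
	email                           jdoe@example.com
	}
```

With that layout, an "ls /etc/nagios/contacts" doubles as the list of 
defined contacts.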

We currently put service checks for individual hosts in the same 
configuration file as the host.  So you will have the following 
definitions in a typical host file (until you get into templating):

define host{
define hostextinfo{
define service{
define service{
...
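
Fleshed out, a host file along those lines might look something like 
the sketch below.  Everything in it is invented for illustration: the 
host web01, its address, the notes URL, and the generic-host and 
generic-service templates, which would have to exist in your 
templates-hosts and templates-services folders.  The check_ping and 
check_http commands are likewise assumed to be defined under commands.

```
# /etc/nagios/hosts-servers/web01.cfg   (hypothetical host)
define host{
	use                     generic-host      ; assumed host template
	host_name               web01
	alias                   Example web server
	address                 192.0.2.10
	}

define hostextinfo{
	host_name               web01
	notes_url               http://wiki.example.com/hosts/web01
	}

define service{
	use                     generic-service   ; assumed service template
	host_name               web01
	service_description     PING
	check_command           check_ping!100.0,20%!500.0,60%
	}

define service{
	use                     generic-service
	host_name               web01
	service_description     HTTP
	check_command           check_http
	}
```

The templates carry the required settings (check intervals, 
notification options, contact groups), so the per-host file stays 
short.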

We put any plugins that we wrote ourselves under a separate folder, 
which keeps them separate from

/usr/local/lib64/nagios-plugins/

Basically, start small, track your changes, and plan on refactoring it 
in week #2, once you are monitoring about a dozen hosts.  Stay away 
from advanced things like escalations or monitoring disk space on 
remote servers until you get the basics working.

Oh, and SELinux will probably get in your way.  So you'll need to play 
with audit2allow to create supplemental policy to give Nagios additional 
permissions.  (Which may have been due to PEBKAC issues on my end - I 
plan on going back and looking at labeling and figuring out what I 
mislabeled.)

I think that's the majority of the issues that we dealt with in the past 
2 weeks.  We're now in fine-tuning mode and getting ready to start 
monitoring remote services next week.