[CentOS] Service monitoring/"Monit"?

Fri Jun 11 08:21:22 UTC 2010
Pavel Lisý <pavel.lisy at gmail.com>

Toralf Lund wrote on Fri, 11 Jun 2010 at 09:21 +0200:
> Has anybody here tried the "Monit" utility (http://mmonit.com/monit/)? I 
> need to set up some kind of "watchdog" functionality for a custom 
> service otherwise started via init (i.e. via a script in /etc/init.d + 
> rc*.d links managed by chkconfig) and it seems like this system may give 
> me nearly what I want. However, something that concerns me about it, is 
> that it appears to take full ownership of the services it monitors, and 
> pretty much bypass the init system, in that once I've told monit to 
> watch a process, it makes no difference if the associated service is 
> enabled or not (via "chkconfig"), and there is no good way to fully 
> stop the service, besides removing the Monit config or disabling Monit 
> itself. (Or maybe I can use special "monit" commands, too, but that 
> means precisely that Monit "owns" the service, which I somehow don't 
> quite like.)
> 
> What I'd really like, is to have my process watched and restarted if it 
> goes away, but only if it had already been started via "init" or 
> through the "service" command, or possibly direct execution of the 
> "init.d" script. This also means it should be restarted if it crashes or 
> if it is stopped via a direct "kill", but not after "service <name> stop" 
> or similar. Which might mean it all boils down to checking if, and only 
> if, the pid file for the service already exists.  (Does that make any 
> sense?)
> 
> So, does anyone know if there is a way to convince Monit to do this for 
> me? Or alternatively, is there a different tool that might do the job?
> 
> And yes, I know it shouldn't be that hard to design a 
> program/script/cron job to do it all for me, but I still think not 
> having to maintain another software component would be nice...
Try this:

1. Put this line in /etc/monit.conf:
include /etc/monit.d/*.conf

2. For each active service, create your own file in /etc/monit.d,
e.g. /etc/monit.d/sshd.conf:

# ssh control
check process sshd with pidfile /var/run/sshd.pid
  start program "/etc/init.d/sshd start"
  stop program "/etc/init.d/sshd stop"
  if 5 restarts within 5 cycles then timeout

3. When you disable a service with
chkconfig sshd off

rename /etc/monit.d/sshd.conf so that it no longer matches the
include pattern *.conf:
mv /etc/monit.d/sshd.conf /etc/monit.d/sshd.conf-dontuse
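
If you do the rename often, it could be wrapped in two small shell
functions. This is only a sketch: the function names monit_disable and
monit_enable and the MONIT_DIR variable are made up for illustration,
and it assumes the /etc/monit.d layout and the "-dontuse" suffix from
the steps above.

```shell
#!/bin/sh
# Directory holding the per-service Monit files (assumed layout from step 1).
MONIT_DIR="${MONIT_DIR:-/etc/monit.d}"

# Hypothetical helper: hide a service's file from the "*.conf" include glob.
monit_disable() {
    svc="$1"
    if [ -f "$MONIT_DIR/$svc.conf" ]; then
        mv "$MONIT_DIR/$svc.conf" "$MONIT_DIR/$svc.conf-dontuse"
    fi
}

# Hypothetical helper: restore the file name so the include glob matches again.
monit_enable() {
    svc="$1"
    if [ -f "$MONIT_DIR/$svc.conf-dontuse" ]; then
        mv "$MONIT_DIR/$svc.conf-dontuse" "$MONIT_DIR/$svc.conf"
    fi
}
```

For example, "chkconfig sshd off && monit_disable sshd" followed by
"monit reload" should make Monit re-read its configuration without the
sshd check.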

Pavel