Service monitoring/"Monit"?


Toralf Lund

11 Jun 2010

7:21 a.m.

Has anybody here tried the "Monit" utility (http://mmonit.com/monit/)? I need to set up some kind of "watchdog" functionality for a custom service otherwise started via init (i.e. via a script in /etc/init.d plus rc*.d links managed by chkconfig), and it seems like this system may give me nearly what I want. However, something that concerns me is that it appears to take full ownership of the services it monitors and pretty much bypasses the init system: once I've told Monit to watch a process, it makes no difference whether the associated service is enabled or not (via "chkconfig"), and there is no good way to fully stop the service besides removing the Monit config or disabling Monit itself. (Or maybe I can use special "monit" commands, too, but that means precisely that Monit "owns" the service, which I somehow don't quite like.)

What I'd really like is to have my process watched and restarted if it goes away, but only if it had already been started via "init", through the "service" command, or possibly by direct execution of the "init.d" script. This also means it should be restarted if it crashes or if it is stopped via a direct "kill", but not after "service <name> stop" or similar. That might mean it all boils down to checking the process if, and only if, the pid file for the service already exists. (Does that make any sense?)

So, does anyone know if there is a way to convince Monit to do this for me? Or alternatively, is there a different tool that might do the job?

And yes, I know it shouldn't be that hard to design a program/script/cron job to do it all for me, but I still think not having to maintain another software component would be nice...

- Toralf



Pavel Lisý

11 Jun 2010

8:21 a.m.

Toralf Lund wrote on Fri, 11 Jun 2010 at 09:21 +0200:

...


Try this:

1. Put this line in /etc/monit.conf:

   include /etc/monit.d/*.conf

2. For active services, create your own files in /etc/monit.d, e.g. /etc/monit.d/sshd.conf:

   # ssh control
   check process sshd with pidfile /var/run/sshd.pid
     start program "/etc/init.d/sshd start"
     stop program "/etc/init.d/sshd stop"
     if 5 restarts within 5 cycles then timeout

3. When you switch a service off with

   chkconfig sshd off

   rename /etc/monit.d/sshd.conf to something different:

   mv /etc/monit.d/sshd.conf /etc/monit.d/sshd.conf-dontuse

Pavel

Toralf Lund

8:34 a.m.

Pavel Lisý wrote:

...


I think you are missing my point. This is precisely what I do not want to do. The last bit here, I mean - i.e. I'm asking for a way to set up so the "rename /etc/monit.d/sshd.conf to something different" step won't be necessary.

- Toralf

Geoff Galitz

8:51 a.m.

...

...
I think you are missing my point. This is precisely what I do not want to do. The last bit here, I mean - i.e. I'm asking for a way to set up so the "rename /etc/monit.d/sshd.conf to something different" step won't be necessary.

Depending on how much effort you are willing to put into this, you can get Nagios to do this. There are two options:

1) Create a custom plugin that checks "service [app] status", or that directly checks the pid file and uses pgrep/grep to look for the app.

2) Write a small script that you stick into the service init script and that tells Nagios to start or stop monitoring a service. I used Perl and LWP to do something very similar, but you can probably find another CPAN module or something similar to act as an interface to Nagios.

If you are comfortable with scripting and Nagios you can bang out a solution in an afternoon. Option 1 you can probably do in less than an hour and is probably better for you.
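A minimal sketch of what option 1 could look like as a shell check, following the Nagios plugin convention of exit codes 0 (OK), 2 (CRITICAL) and 3 (UNKNOWN). The function name check_pidfile and the pid file path in the usage comment are illustrative assumptions, not part of Nagios or of the posts above. A missing pid file is treated as "stopped on purpose", which only gives the behaviour asked for if the init script removes the pid file on a clean stop.

```shell
#!/bin/sh
# Nagios-style check (sketch): a missing pid file means the service was
# stopped cleanly; a pid file without a live process means it crashed.

check_pidfile() {
    pidfile=$1

    # No pid file: assume the service was never started or was stopped cleanly.
    if [ ! -f "$pidfile" ]; then
        echo "OK - no pid file, service not started"
        return 0
    fi

    pid=$(cat "$pidfile" 2>/dev/null) || pid=''
    case $pid in
        ''|*[!0-9]*)
            echo "UNKNOWN - cannot read a pid from $pidfile"
            return 3
            ;;
    esac

    # kill -0 sends no signal; it only tests whether the process can be
    # signalled. Assumption: the check runs as root or as the service's
    # own user, otherwise a live process may be reported as missing.
    if kill -0 "$pid" 2>/dev/null; then
        echo "OK - process $pid is running"
        return 0
    fi

    echo "CRITICAL - pid file exists but process $pid is not running"
    return 2
}

# Usage as a plugin (hypothetical path):
#   check_pidfile /var/run/myapp.pid; exit $?
```

The same function could drive a restart instead of an alert by running the init script's "start" action when it returns 2.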

--------------------------------- Geoff Galitz Blankenheim NRW, Germany http://www.galitz.org/ http://german-way.com/blog/

Toralf Lund

11:07 a.m.

Geoff Galitz wrote:

...

...
I think you are missing my point. This is precisely what I do not want to do. The last bit here, I mean - i.e. I'm asking for a way to set up so the "rename /etc/monit.d/sshd.conf to something different" step won't be necessary.

Depending on how much effort you are willing to put into this, you can get nagios to do this. There are two options:

Hmmm... I've used Nagios before, but it must be 10 years ago now, so probably I'd have to re-learn it. But I'm sure it could be done.

I have a feeling that introducing nagios monitoring is a little too involved, though. I want to distribute the setup to external systems, so ideally there should be one "monitoring" package install and a simple "enable" command at the most, in addition to installation of the actual software.

    rpm -Uvh monit-<version>.rpm
    chkconfig monit on

Is really quite ideal, except for the fact that it only nearly gives me what I want :-(

But thanks anyway,

- Toralf


CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos

Geoff Galitz

11:39 a.m.

...

I have a feeling that introducing nagios monitoring is a little too involved, though. I want to distribute the setup to external systems, so ideally there should be one "monitoring" package install and a simple "enable" command at the most, in addition to installation of the actual software.

    rpm -Uvh monit-<version>.rpm
    chkconfig monit on

Is really quite ideal, except for the fact that it only nearly gives me what I want :-(

Gotcha. Would a simple shell script be sufficient? Something like:

-------------------------------
while :    ## loop forever
do
    /sbin/service [app] status
    ## services which should be running, but are dead,
    ## return a non-zero status code
    if [ $? -ne 0 ]
    then
        /sbin/service [app] restart
    fi
    sleep 120    ## sleep for about two minutes
done
-------------------------------

Adding mail notifications and other standard functions would be trivial. To monitor various services just add a for loop in there.

Just a thought. -Geoff
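One way to get the behaviour asked for at the start of the thread out of a script like this is to look at which non-zero code "status" returns, rather than treating every non-zero code as a crash. The sketch below assumes the init script follows the LSB convention for "status" exit codes (0 running, 1 dead but pid file exists, 2 dead but lock file exists, 3 not running); init scripts that do not follow it will need a different test. SERVICE_CMD, needs_restart and watchdog_pass are names made up for this example.

```shell
#!/bin/sh
# Watchdog sketch: restart a service only if it died without a clean stop.
# Assumes LSB-style "status" exit codes:
#   0 = running, 1 = dead but pid file exists,
#   2 = dead but lock file exists, 3 = not running (stopped cleanly).
# SERVICE_CMD is a hypothetical override so the logic can be tried
# without a real init script.
SERVICE_CMD=${SERVICE_CMD:-/sbin/service}

# Succeeds (returns 0) when the given status code means "crashed".
needs_restart() {
    case $1 in
        1|2) return 0 ;;
        *)   return 1 ;;
    esac
}

# One watchdog pass over the services named as arguments.
watchdog_pass() {
    for svc in "$@"; do
        rc=0
        "$SERVICE_CMD" "$svc" status >/dev/null 2>&1 || rc=$?
        if needs_restart "$rc"; then
            echo "restarting $svc (status exit code $rc)"
            "$SERVICE_CMD" "$svc" restart
        fi
    done
}

# Example: loop forever, checking every two minutes ("myapp" is a placeholder).
#   while :; do watchdog_pass myapp; sleep 120; done
```

A service stopped with "service <name> stop" reports 3 and is left alone, while one that crashed or was killed directly reports 1 or 2, provided the daemon does not remove its own pid and lock files when it dies. A single watchdog_pass call could equally be run from cron instead of the loop.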


Toralf Lund

1:14 p.m.

Geoff Galitz wrote:

...


Gotcha. Would a simple shell script be sufficient? Something like:

Yeah, I might just end up doing something like this...

I suppose I could also have "cron" control the "iterations" rather than doing a loop with wait, but that's just a matter of taste...


Adding mail notifications and other standard functions would be trivial. To monitor various services just add a for loop in there. Just a thought.

Yep. Maybe it can be that simple. I suppose I just wanted to know if somebody else had published something that would essentially be this script with a few of the trivial extensions you mention, as I don't want to re-invent the wheel even if it's a very basic one. But perhaps not...

- Toralf


Karanbir Singh

8:35 a.m.

On 11/06/2010 08:21, Toralf Lund wrote:

...

Has anybody here tried the "Monit" utility (http://mmonit.com/monit/)?

For what you are trying to do, take a look at God (http://god.rubyforge.org/) instead or, as a less complex solution, Bluepill (http://github.com/arya/bluepill).

I've used Monit quite extensively in the past and found it quite limiting, in that it won't let you easily wrap your app into a self-managed bundle or use arbitrary conditions to influence monitoring state. E.g. when you are doing a "yum upgrade httpd", you don't really want Monit to try starting httpd when the rpm -e has just happened.

In your specific case, it will be nearly impossible for you to get Monit to do something like 'keep an eye on sshd, but only if it was already running, and otherwise keep it in the state I left it in manually.'

hth

- KB

Toralf Lund

9:49 a.m.

Karanbir Singh wrote:

...

On 11/06/2010 08:21, Toralf Lund wrote:

...
Has anybody here tried the "Monit" utility (http://mmonit.com/monit/)?

For what you are trying to do, take a look at God (http://god.rubyforge.org/) instead or, as a less complex solution, Bluepill (http://github.com/arya/bluepill).

OK, thanks, I'll have a look.

...

I've used Monit quite extensively in the past and found it to be quite limiting in that it won't let you easily wrap your app into a self-managed bundle,

That's actually part of what I want to do. I'd like to have an rpm package install put a "real" monitoring config in /etc/monitors.d, but I can't really do it if that means automatically starting the service - it must be possible (and simple) to install the software without forcing it to run directly.

...

or be able to use arbitrary conditions to influence monitoring state. E.g. when you are doing a "yum upgrade httpd", you don't really want Monit to try starting httpd when the rpm -e has just happened.

Quite. But if you could set it up the way I want, it wouldn't, provided that the service was stopped in a controlled manner (which is something one of the rpm scriptlets might do).

...

In your specific case, it will be nearly impossible for you to get Monit to do something like 'keep an eye on sshd, but only if it was already running, and otherwise keep it in the state I left it in manually.'

Too bad, really, since it seems like it very nearly has what's needed. There is also a "dependency" mechanism that would do this, if only it implemented real, hard dependencies. Right now, (as far as I can tell) "depending" on another service merely means that monit will try to start the other service first - the "dependent" service will not be stopped or skipped if the operation fails.

- Toralf

Karanbir Singh

10:21 a.m.

On 11/06/2010 10:49, Toralf Lund wrote:

...

That's actually part of what I want to do. I'd like to have an rpm package install put a "real" monitoring config in /etc/monitors.d, but I can't really do it if that means automatically starting the service - it must be possible (and simple) to install the software without forcing it to run directly.

God and Bluepill can both do these things. It's a classic case of what I call reactive monitoring. So you only look at specific conditions, then wrap a policy around each. The problem with Monit is that it's unable to handle more than one condition in one run cycle, and it's extremely hard to do coordinated scheduling across tasks using the Monit config files.

Imho, Monit is a good implementation of init; and useful for situations where the app can handle contingencies and policy itself.

...

Too bad, really, since it seems like it very nearly has what's needed. There is also a "dependency" mechanism that would do this,

Deps are important, especially when you daisy-chain tasks, e.g. Nagios to monitor machine state, user-facing external (or cross-machine) interfaces, and Bluepill. Then have Bluepill handle app state locally.

- KB

Toralf Lund

11:25 a.m.

Karanbir Singh wrote:

...

On 11/06/2010 10:49, Toralf Lund wrote:

...
That's actually part of what I want to do. I'd like to have an rpm package install put a "real" monitoring config in /etc/monitors.d, but I can't really do it if that means automatically starting the service - it must be possible (and simple) to install the software without forcing it to run directly.

God and Bluepill can both do these things. It's a classic case of what I call reactive monitoring.

OK.

From reading the docs on the web sites, there are a number of things that are not quite clear, like what the "process" tests actually do, or how I might combine a file test ("does the pid file exist?") with them (if that's what I want). This is perhaps partly because I don't speak Ruby, and I'm sure what I want is doable...

Do you happen to know if any of these tools are available from yum repositories and/or in rpm package form?

...

So you only look at specific conditions, then wrap a policy around each. The problem with Monit is that it's unable to handle more than one condition in one run cycle, and it's extremely hard to do coordinated scheduling across tasks using the Monit config files.

Imho, Monit is a good implementation of init; and useful for situations where the app can handle contingencies and policy itself.

...
Too bad, really, since it seems like it very nearly has what's needed. There is also a "dependency" mechanism that would do this,

Deps are important,

Definitely. But to my mind, "service A depends on service B" should mean that if service B isn't active and can't be started, then there will be no attempt to start service A either, and there might even be an attempt to stop it if it's already running. Not so with Monit, like I said...
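The "hard" dependency semantics described here can be sketched in shell as follows. This is not something Monit provides; it only illustrates the rule "if B cannot be brought up, do not start A, and stop A if it is running". SERVICE_CMD, ensure_running and start_with_hard_dep are hypothetical names, and the sketch assumes init scripts whose "status" action exits 0 only when the service is running.

```shell
#!/bin/sh
# Hard-dependency sketch. SERVICE_CMD is a hypothetical override hook so
# the logic can be exercised without real init scripts.
SERVICE_CMD=${SERVICE_CMD:-/sbin/service}

# ensure_running NAME: succeed if NAME is running or can be started.
ensure_running() {
    "$SERVICE_CMD" "$1" status >/dev/null 2>&1 && return 0
    "$SERVICE_CMD" "$1" start >/dev/null 2>&1 || return 1
    "$SERVICE_CMD" "$1" status >/dev/null 2>&1
}

# start_with_hard_dep DEPENDENT DEPENDENCY
# Start DEPENDENT only if DEPENDENCY is (or can be made) available;
# otherwise make sure DEPENDENT is stopped and report failure.
start_with_hard_dep() {
    dependent=$1
    dependency=$2
    if ! ensure_running "$dependency"; then
        echo "$dependency is unavailable; stopping $dependent"
        "$SERVICE_CMD" "$dependent" stop >/dev/null 2>&1 ||
            echo "warning: could not stop $dependent" >&2
        return 1
    fi
    ensure_running "$dependent"
}

# Usage (service names are placeholders):
#   start_with_hard_dep serviceA serviceB
```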

- Toralf


Karanbir Singh

1:59 p.m.

On 11/06/2010 12:25, Toralf Lund wrote:

...

From reading the docs on the web sites, there are a number of things that are not quite clear, like what the "process" tests actually do, or how I might combine a file test ("does the pid file exist?") with them (if that's what I want). This is perhaps partly because I don't speak Ruby, and I'm sure what I want is doable...

Ok, that might be a bit of an issue for you - but basically, you get an object that you can do any sort of tests on. eg

    restart.condition(:memory_usage) do |mu|
      mu.above = 1024.megabytes
      mu.times = [5, 7]
    end

This will restart the process if 5 of the last 7 checks indicated it was using more than 1 GB of RAM. It's basically just Ruby and coding around it, so you can define whatever logic or smartness you want around the monitoring and just code it in.

...

Do you happen to know if any of these tools are available from yum repositories and/or in rpm package form?

Not that I am aware of. I've been using them from an internal repo we have here at work, but that's heavily optimised for our own specific requirements (e.g. running on Ruby 1.9.1 etc.). There is a gem2spec app that will happily convert a Ruby gem into an RPM spec file.

...

Definitely. But to my mind, "service A depends on service B" should mean that if service B isn't active and can't be started, then there will be no attempt to start service A either, and there might even be an attempt to stop it if it's already running. Not so with Monit, like I said...

This is where the flexibility of God comes into play, you can write any sort of logic around the conditionals.

- KB
