On May 29, 2007, at 4:24 PM, Tony Mountifield wrote:

> I have a small number of boxes in different locations, and currently
> have a fairly crude cron job running on each, which does a ping of one
> or more of the other boxes, and if the ping fails, it emails me to say
> the other box might be down. It then emails me again the next time the
> other box appears to be up.
>
> Of course, this can't distinguish between the remote box really being
> down and there being a network problem somewhere between the local and
> remote boxes.
>
> I've been mulling over the idea of a more sophisticated scheme, where
> a number of boxes send each other messages, indicating not only their
> presence, but which other boxes they believe to be up. Then if a box
> goes down, the other boxes all see it has gone and agree that it really
> is down. However, if there is instead a network outage or routing flap
> so that a box is reachable from some places but not all, it might be
> possible to distinguish this case.
>
> So my question is: does anyone know of an existing tool that does this
> sort of thing?
>
> Cheers
> Tony

Nagios does this, although it can be a bit much to configure. What
you're particularly looking for seems to be "dependency" support, i.e.
if your gateway is down, you don't want to be notified that every
server you reach through that gateway is also down.

A nice basic tutorial for Nagios I found is at:

http://www2.maxsworld.org/howtos/nagios.html

It doesn't delve into dependencies much, but they shouldn't be that
difficult to set up.

dex

----------
Mobile: +63 (917) 5357191, Office: +63 (2) 6312718
i4 Asia Incorporated - http://www.i4asiacorp.com/
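
P.S. For what it's worth, here is a rough sketch of the gateway case in
Nagios object configuration. It is not taken from the tutorial above.
The host names and addresses are made up for illustration, and
"generic-host" is assumed to be a host template defined elsewhere in
your configuration that supplies the check command, contacts and other
required settings. The "parents" directive is what tells Nagios that
one host is reached through another.

```
# Hypothetical hosts; "generic-host" is an assumed template that
# provides check_command, contacts, check intervals, etc.
define host {
    use                   generic-host
    host_name             gateway1
    alias                 Remote site gateway
    address               192.0.2.1
}

define host {
    use                   generic-host
    host_name             web1
    alias                 Server behind gateway1
    address               192.0.2.10
    parents               gateway1   ; web1 is reached through gateway1
    notification_options  d,r        ; notify on DOWN and recovery only
}
```

With a parent defined, a failed check of web1 while gateway1 is also
failing should be reported as UNREACHABLE rather than DOWN. Leaving "u"
out of notification_options is one way to avoid being notified about
every host behind the gateway.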