Tony Mountifield wrote:
> I have a small number of boxes in different locations, and currently have
> a fairly crude cron job running on each, which does a ping of one or more
> of the other boxes, and if the ping fails, it emails me to say the other
> box might be down. It then emails me again the next time the other box
> appears to be up.
>
> Of course, this can't distinguish between the remote box really being down
> and there being a network problem somewhere between the local and remote
> boxes.
>
> I've been mulling over the idea of a more sophisticated scheme, where
> a number of boxes send each other messages, indicating not only their
> presence, but which other boxes they believe to be up. Then if a box
> goes down, the other boxes all see it has gone and agree that it really
> is down. However, if there is instead a network outage or routing flap
> so that a box is reachable from some places but not all, it might be
> possible to distinguish this case.
>
> So my question is: does anyone know of an existing tool that does this
> sort of thing?
>
> Cheers
> Tony

Tony,

Nagios, maybe.
http://www.nagios.org/

Not familiar with it, but there has been a lot of talk on the list.

Bob...
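
For what it's worth, the agreement step Tony describes (each box reporting which other boxes it believes to be up) could be sketched roughly as below. This is only an illustration of the idea, not how Nagios or any existing tool works. The function name `classify`, the `views` mapping, and the box names are all made up for the example, and it assumes the reports have already been collected somehow.

```python
from typing import Dict, Set


def classify(target: str, views: Dict[str, Set[str]]) -> str:
    """Combine peer reports about one box.

    `views` is a hypothetical mapping: reporting box -> set of boxes
    that reporter currently believes to be up.
    """
    # Ignore anything the target says about itself; only peers count.
    reports = [target in seen for box, seen in views.items() if box != target]
    if not reports:
        return "unknown"   # nobody else reported, so no conclusion
    if all(reports):
        return "up"        # every reporting peer can reach it
    if not any(reports):
        return "down"      # no reporting peer can reach it
    return "partial"       # reachable from some places but not all


# Box c is reachable from a but not from b: looks like a network problem.
views = {"a": {"b", "c"}, "b": {"a"}}
print(classify("c", views))
```

Note that "down" here only means that none of the reporting boxes can reach the target. An outage right next to the target would look the same, so this narrows the ambiguity rather than removing it.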