[CentOS] Keepalived - spurious failovers

Wed Nov 12 15:44:05 UTC 2014
Richard Mann <rmann at ilsworld.com>


+1 to your logrotate thought; I'd dig deeper there.

Check /var/lib/logrotate.status and see whether the rotation dates recorded there for the various httpd logs match up with the days the failovers happen.
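
If it helps, here is a rough Python sketch of that comparison. It assumes the status file has the usual layout of a quoted log path followed by a date; the helper names (parse_status, rotated_on), the sample entries and the failover date are all made up for illustration, so swap in the real file and the real dates. Keep in mind the status file only records the most recent rotation of each log, so it is best checked soon after a failover.

```python
import re
from datetime import date

# Matches lines such as: "/var/log/httpd/access_log" 2014-11-12
# (newer logrotate versions append a time after the date; it is ignored here)
STATUS_LINE = re.compile(
    r'^"(?P<path>.+)"\s+(?P<y>\d{4})-(?P<m>\d{1,2})-(?P<d>\d{1,2})'
)

def parse_status(text):
    """Return {log path: date of last rotation} from logrotate status text."""
    entries = {}
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:  # the "logrotate state -- version N" header does not match
            entries[match.group("path")] = date(
                int(match.group("y")), int(match.group("m")), int(match.group("d"))
            )
    return entries

def rotated_on(entries, days, prefix=""):
    """List the logs under `prefix` whose last rotation fell on one of `days`."""
    return sorted(
        path for path, day in entries.items()
        if day in days and path.startswith(prefix)
    )

# Made-up sample for illustration only; on the real host read the file instead:
#   text = open("/var/lib/logrotate.status").read()
SAMPLE = '''logrotate state -- version 2
"/var/log/httpd/access_log" 2014-11-12
"/var/log/httpd/error_log" 2014-11-12
"/var/log/yum.log" 2014-1-1
'''

failover_days = {date(2014, 11, 12)}  # hypothetical failover date
print(rotated_on(parse_status(SAMPLE), failover_days, prefix="/var/log/httpd/"))
```

An empty list for the httpd prefix on a failover day would point away from httpd log rotation; a match only shows the two coincide, not that one causes the other.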


-----Original Message-----
From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On Behalf Of John Horne
Sent: Wednesday, November 12, 2014 10:36 AM
To: CentOS mailing list
Subject: Re: [CentOS] Keepalived - spurious failovers

On Wed, 2014-11-12 at 10:27 -0500, m.roth at 5-cent.us wrote:
> John Horne wrote:
> >
> > We are using CentOS 6.6 and keepalived 1.2.13 on two servers for
> > failover, no load-balancing. Failover is governed by the NIC being
> > present, and the Apache and Tomcat processes being present. Both servers
> > are configured as 'EQUAL' (not master/backup). An initial priority of
> > 100 is set, and if a process or NIC fails, then this is reduced by 60 -
> > causing a lower priority to be seen and failover to take place.
> > Generally this works well. If we stop the network or one of the
> > processes, this is logged (to /var/log/messages) and failover happens
> > within a few seconds.
> >
> > However, we have had failovers occur during the night several times. It
> > happened last night, and the night before. Nothing was logged in the
> > messages file about the NIC being down, or the Apache/Tomcat processes
> > being unavailable. Nothing was logged by the Apache or Tomcat processes
> > in their own log files. The failovers have happened at 03:56 on both
> > nights.
> >
> > The most obvious suspect causing this would be some nighttime process
> > such as log rotation or automatic updates. However, I can see nothing
> > obvious occurring during the night that would cause the keepalived
> > virtual interface to fail over.
> <snip>
> I trust you've looked at the crontab, and /etc/cron.daily, etc.
>
Yes. Nothing obvious that would cause a problem to apache/tomcat or the
network.

> The other option: have you looked *outside* the systems? Do you have a
> cable between the two, or is it over the network? Is there a network
> thing going on? For example, are the servers on a UPS, and the switch
> they're on not on one?
> 
They are both virtual servers - so no UPS. Failover communication is
over the network.



John.

-- 
John Horne                   Tel: +44 (0)1752 587287
Plymouth University, UK

_______________________________________________
CentOS mailing list
CentOS at centos.org
http://lists.centos.org/mailman/listinfo/centos