+1 to your logrotate thought; I'd dig deeper there.
Check /var/lib/logrotate.status and see whether the days on which the various httpd logs were rotated match up with the days the failover happens.
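If it helps, here is a rough Python sketch of that comparison. It assumes the status file holds one quoted log path per line followed by a date such as 2014-11-12 (newer logrotate versions append a time, which the sketch ignores); the function names are my own, not part of any tool.

```python
import re
from datetime import date

# Assumed line format: "/var/log/httpd/access_log" 2014-11-12
# Newer logrotate versions add a time suffix, e.g. 2014-11-12-3:56:1
STATUS_LINE = re.compile(
    r'^"?(?P<path>[^"]+?)"?\s+(?P<y>\d+)-(?P<m>\d+)-(?P<d>\d+)(?:-\S*)?\s*$'
)

def parse_status(text):
    """Return {log_path: last_rotation_date} from logrotate status text."""
    rotations = {}
    for line in text.splitlines():
        match = STATUS_LINE.match(line)
        if match:  # the "logrotate state -- version N" header does not match
            rotations[match["path"]] = date(
                int(match["y"]), int(match["m"]), int(match["d"])
            )
    return rotations

def rotated_on(rotations, days, needle="httpd"):
    """List log paths containing `needle` whose last rotation fell on one of `days`."""
    return sorted(p for p, d in rotations.items() if needle in p and d in days)
```

Feed it the contents of /var/lib/logrotate.status and the set of failover dates. The status file only records the most recent rotation of each log, so this can only confirm or rule out the latest failover night.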
-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of John Horne
Sent: Wednesday, November 12, 2014 10:36 AM
To: CentOS mailing list
Subject: Re: [CentOS] Keepalived - spurious failovers
On Wed, 2014-11-12 at 10:27 -0500, m.roth@5-cent.us wrote:
John Horne wrote:
We are using CentOS 6.6 and keepalived 1.2.13 on two servers for failover, no load-balancing. Failover is governed by the NIC being present, and the Apache and Tomcat processes being present. Both servers are configured as 'EQUAL' (not master/backup). An initial priority of 100 is set, and if a process or NIC fails, then this is reduced by 60 - causing a lower priority to be seen and failover to take place. Generally this works well. If we stop the network or one of the processes, this is logged (to /var/log/messages) and failover happens within a few seconds.
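To make the priority arithmetic above concrete, here is a small Python sketch. It is only an illustration, not keepalived's own logic: it assumes a single 60-point deduction no matter how many checks fail, and that a node gives up the virtual address only when its peer advertises a strictly higher priority. The names are hypothetical.

```python
BASE_PRIORITY = 100  # initial priority on both servers
PENALTY = 60         # deduction when the NIC or a process check fails

def effective_priority(nic_up, apache_up, tomcat_up):
    """Priority a node advertises; assumes one deduction for any failure."""
    healthy = nic_up and apache_up and tomcat_up
    return BASE_PRIORITY if healthy else BASE_PRIORITY - PENALTY

def should_fail_over(holder_priority, peer_priority):
    """Assumption: the address moves only if the peer's priority is strictly higher."""
    return peer_priority > holder_priority
```

With both nodes healthy the priorities are equal (100 and 100) and nothing moves; if Apache stops on the holder, it drops to 40 and the peer takes over.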
However, we have had failovers occur during the night several times. It happened last night, and the night before. Nothing was logged in the messages file about the NIC being down, or the Apache/Tomcat processes being unavailable. Nothing was logged by the Apache or Tomcat processes in their own log files. The failovers have happened at 03:56 on both nights.
The most obvious suspect would be some nighttime process such as log rotation or automatic updates. However, I can see nothing obvious occurring during the night that would cause the keepalived virtual interface to fail over.
<snip> I trust you've looked at the crontab, and /etc/cron.daily, etc.
Yes. Nothing obvious that would cause a problem for Apache/Tomcat or the network.
The other option: have you looked *outside* the systems? Do you have a cable between the two, or is it over the network? Is there a network thing going on? For example, are the servers on a UPS, and the switch they're on not on one?
They are both virtual servers - so no UPS. Failover communication is over the network.
John.