[CentOS] Cluster Failover Troubleshooting (luci and ricci)

Fri Jul 1 20:51:10 UTC 2011
Ljubomir Ljubojevic <office at plnet.rs>

Ryan Bunce wrote:
> I must say that I am less familiar with how all of the cluster 
> components work together. All of the Linux clusters I have built thus 
> far have been heartbeat+mon style clusters.
> 
> I'm looking to find out if there is an additional debug layer that I can 
> put in place to get some more detailed information about what is 
> transacting (or not) between the two cluster members.
> 
> Many thanks.

I have never installed or used a Conga (luci/ricci) system.

But as far as I know and understand, a failing server needs a way to 
warn the rest of the nodes. Your log said it failed.

Some failover systems need a separate network connected to the shared 
file systems. So when eth0 is not working, the main node will use 
another interface (eth1, eth2, eth3, eth4) to report this event to all 
other nodes.

What comes to mind is that the IPs set for the interconnect (in the luci 
configuration) must not be public IPs but addresses on that 
separate/secondary network, so that the main node is able to contact the 
rest of the nodes.
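
If it helps, a quick way to check which addresses are public could be 
sketched in Python like this. It is only a sketch: the helper name and 
the addresses are made up for illustration and are not taken from any 
real luci configuration. It uses only the standard ipaddress module.

```python
import ipaddress


def non_private_addresses(addresses):
    """Return the addresses that are NOT in a private range.

    Note: is_private is true for the RFC 1918 ranges (10/8, 172.16/12,
    192.168/16) and also for other non-global ranges such as loopback.
    """
    return [a for a in addresses if not ipaddress.ip_address(a).is_private]


# Hypothetical interconnect addresses, one per cluster node.
interconnect = ["10.0.1.1", "10.0.1.2", "8.8.8.8"]

# Any address listed here is public and probably does not belong to the
# separate interconnect network.
print(non_private_addresses(interconnect))
```

An empty list would mean that all interconnect addresses are private. 
It does not prove that they are on the same separate network, so the 
subnets still need to be compared by hand.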

I hope this helps.

Ljubomir