Hi,
For testing purposes I'm trying to create a two-node HA environment for running some services (openvpn and haproxy). I installed two CentOS 6.4 KVM guests.
I was able to create a cluster and some resources. I followed the document https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/...
But my cluster does not behave as expected. After starting the cluster software on both nodes, they can see each other.
----------------------------------------
[root@lb1 ~]# pcs status
Cluster name: LB.STK
Last updated: Mon Jan 13 15:34:21 2014
Last change: Mon Jan 13 15:24:47 2014 via cibadmin on lb1.asol.local
Stack: cman
Current DC: lb1.asol.local - partition with quorum
Version: 1.1.10-14.el6_5.1-368c726
2 Nodes configured
2 Resources configured
Online: [ lb1.asol.local lb2.asol.local ]
Full list of resources:
 Resource Group: LB
     LAN.VIP (ocf::heartbeat:IPaddr2): Started lb2.asol.local
     WAN.VIP (ocf::heartbeat:IPaddr2): Started lb2.asol.local
----------------------------------------
After a manual shutdown of node 2 (pcs cluster stop), node 1 does not get this information and still believes node 2 is up and running. In the corosync log on lb2 these lines keep repeating:
Jan 13 15:38:43 [1712] lb2.asol.local cib: info: crm_client_new: Connecting 0x25a3810 for uid=0 gid=0 pid=10763 id=2b06a195-11f6-452d-992b-5ea0c69be21a
Jan 13 15:38:43 [1712] lb2.asol.local cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_resource/2, version=0.7.4)
Jan 13 15:38:43 [1712] lb2.asol.local cib: info: crm_client_destroy: Destroying 0 events
Jan 13 17:24:24 corosync [TOTEM ] Retransmit List: 9a 9b 9c
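To show how often the "Retransmit List" message repeats and which sequence numbers keep being re-requested, a small script along these lines could summarize the log. This is only a rough sketch: the function name retransmit_counts and the sample list are made up for illustration, and the pattern assumes the line format shown above.

```python
import re
from collections import Counter

# Matches corosync lines of the form shown above, e.g.
#   Jan 13 17:24:24 corosync [TOTEM ] Retransmit List: 9a 9b 9c
# (assumption: sequence numbers are space-separated hex values)
RETRANSMIT = re.compile(
    r"\[TOTEM\s*\]\s+Retransmit List:\s*(?P<seqs>[0-9a-f ]+)",
    re.IGNORECASE,
)


def retransmit_counts(lines):
    """Count how often each sequence number appears in Retransmit List lines."""
    counts = Counter()
    for line in lines:
        match = RETRANSMIT.search(line)
        if match:
            counts.update(match.group("seqs").split())
    return counts


# Two lines copied from the log excerpt above (hypothetical sample input).
sample = [
    "Jan 13 15:38:43 [1712] lb2.asol.local cib: info: crm_client_destroy: Destroying 0 events",
    "Jan 13 17:24:24 corosync [TOTEM ] Retransmit List: 9a 9b 9c",
]
counts = retransmit_counts(sample)
for seq, n in sorted(counts.items()):
    print(seq, n)
```

In practice the lines would be read from the corosync log file on each node instead of the sample list; the path of that file depends on the logging configuration.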
The firewall on both nodes is open for incoming traffic from the other node, and stonith-enabled is set to false. I created SSH keys for the root user, so I can ssh back and forth without a password. The pacemaker version is 1.1.10-14.
Do you have any idea where the problem might be?
thanks
martin