On 29/10/14 09:33 AM, aditya hilman wrote:
Oct 29 13:15:30 web2 fenced[1548]: fenced 3.0.12.1 started
Oct 29 13:15:30 web2 dlm_controld[1568]: dlm_controld 3.0.12.1 started
Oct 29 13:15:30 web2 gfs_controld[1621]: gfs_controld 3.0.12.1 started
Oct 29 13:16:21 web2 fenced[1548]: fencing node web3.cluster
Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster dev 0.0 agent fence_rhevm result: error from agent
Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed
Oct 29 13:16:27 web2 fenced[1548]: fencing node web3.cluster
Oct 29 13:16:29 web2 fenced[1548]: fence web3.cluster success
It looks like web2 didn't see the other node on boot, gave up waiting, and fenced the peer. The first fence call failed at 13:16:24 before the retry succeeded at 13:16:29, which is another sign of a general network issue.
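If you want to check for the same fail-then-succeed pattern across a longer log, something like the following rough sketch might help. It only assumes the fenced message format seen in the excerpt above; the function name `fence_results` and the `sample` list are made up for illustration, and other fenced versions may word these messages differently.

```python
import re

# Matches fenced result lines in the format seen in the quoted log, e.g.
#   "Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed"
# Assumption: only the two outcomes "failed" and "success" are of interest.
FENCE_RESULT = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+fenced\[\d+\]:\s+"
    r"fence\s+(?P<node>\S+)\s+(?P<result>failed|success)\s*$"
)


def fence_results(lines):
    """Return (timestamp, node, result) tuples in log order.

    Verbatim repeats of the same entry are skipped, since syslog output
    is sometimes duplicated.
    """
    seen = set()
    results = []
    for line in lines:
        match = FENCE_RESULT.match(line.strip())
        if not match:
            continue  # not a fence result line (e.g. "fencing node ...")
        entry = (match.group("stamp"), match.group("node"), match.group("result"))
        if entry not in seen:
            seen.add(entry)
            results.append(entry)
    return results


# Lines copied from the quoted log; in practice you would iterate over a log file.
sample = [
    "Oct 29 13:16:21 web2 fenced[1548]: fencing node web3.cluster",
    "Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster dev 0.0 agent fence_rhevm result: error from agent",
    "Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed",
    "Oct 29 13:16:27 web2 fenced[1548]: fencing node web3.cluster",
    "Oct 29 13:16:29 web2 fenced[1548]: fence web3.cluster success",
]

for stamp, node, result in fence_results(sample):
    print(stamp, node, result)
```

For the lines above this should list one failed attempt against web3.cluster followed by one success. A failure that is followed by a success a few seconds later, with nothing changed in between, points at something intermittent rather than a bad fence configuration.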
As an aside, did you configure corosync.conf? If so, don't. Let cman handle everything.
Are you starting cman on both nodes at (close to) exactly the same time?