[CentOS] CentOS 6.5 RHCS fence loops

Digimer lists at alteeve.ca
Wed Oct 29 14:33:14 UTC 2014


On 29/10/14 09:33 AM, aditya hilman wrote:
> Oct 29 13:15:30 web2 fenced[1548]: fenced 3.0.12.1 started
> Oct 29 13:15:30 web2 fenced[1548]: fenced 3.0.12.1 started
> Oct 29 13:15:30 web2 dlm_controld[1568]: dlm_controld 3.0.12.1 started
> Oct 29 13:15:30 web2 dlm_controld[1568]: dlm_controld 3.0.12.1 started
> Oct 29 13:15:30 web2 gfs_controld[1621]: gfs_controld 3.0.12.1 started
> Oct 29 13:15:30 web2 gfs_controld[1621]: gfs_controld 3.0.12.1 started
> Oct 29 13:16:21 web2 fenced[1548]: fencing node web3.cluster
> Oct 29 13:16:21 web2 fenced[1548]: fencing node web3.cluster
> Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster dev 0.0 agent
> fence_rhevm result: error from agent
> Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster dev 0.0 agent
> fence_rhevm result: error from agent
> Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed
> Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed
> Oct 29 13:16:27 web2 fenced[1548]: fencing node web3.cluster
> Oct 29 13:16:27 web2 fenced[1548]: fencing node web3.cluster
> Oct 29 13:16:29 web2 fenced[1548]: fence web3.cluster success
> Oct 29 13:16:29 web2 fenced[1548]: fence web3.cluster success

It looks like web2 didn't see the other node when it booted, gave up 
waiting and fenced the peer. The first fence call failed and the retry 
succeeded five seconds later, which is another sign of a general 
network issue.
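
If you want to check how often this fail-then-succeed pattern shows up 
in a longer log, here is a rough sketch in Python. It assumes the 
fenced messages look exactly like the ones quoted above; the function 
name fence_outcomes and the regular expression are mine, not part of 
any cluster tool, and exact repeated lines are counted once.

```python
import re

# Matches fenced's outcome lines as they appear in the quoted log, e.g.
# "Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed"
# Assumption: the message format matches the excerpt in this thread.
OUTCOME = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+fenced\[\d+\]:\s+"
    r"fence\s+(?P<node>\S+)\s+(?P<result>success|failed)\s*$"
)


def fence_outcomes(lines):
    """Return (timestamp, node, result) tuples in log order.

    Leading "> " quote markers are stripped, and exact repeats of the
    same outcome are reported only once.
    """
    seen = set()
    outcomes = []
    for line in lines:
        match = OUTCOME.match(line.lstrip("> ").rstrip())
        if not match:
            continue
        key = (match.group("ts"), match.group("node"), match.group("result"))
        if key not in seen:
            seen.add(key)
            outcomes.append(key)
    return outcomes


if __name__ == "__main__":
    sample = [
        "> Oct 29 13:16:21 web2 fenced[1548]: fencing node web3.cluster",
        "> Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed",
        "> Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed",
        "> Oct 29 13:16:29 web2 fenced[1548]: fence web3.cluster success",
    ]
    for ts, node, result in fence_outcomes(sample):
        print(ts, node, result)
```

A "failed" entry followed shortly by a "success" for the same node is 
the pattern to look for; it only tells you the agent call was flaky, 
not why.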

As an aside, did you configure corosync.conf? If so, don't. Let cman 
handle everything.

Are you starting cman on both nodes at (or very close to) the same time?

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?


