Hello Digimer, I have already configured cluster.conf as you advised, but when I start cman manually on web3 (cman was already stopped), web2 is fenced by web3. Here is the log on web3:

Oct 29 14:38:42 web3 ricci[2557]: Executing '/usr/bin/virsh nodeinfo'
Oct 29 14:38:42 web3 ricci[2559]: Executing '/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/queue/1604501608'
Oct 29 14:38:42 web3 modcluster: Updating cluster.conf
Oct 29 14:39:05 web3 corosync[2651]: [MAIN ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Oct 29 14:39:05 web3 corosync[2651]: [MAIN ] Corosync built-in features: nss dbus rdma snmp
Oct 29 14:39:05 web3 corosync[2651]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
Oct 29 14:39:05 web3 corosync[2651]: [MAIN ] Successfully parsed cman config
Oct 29 14:39:05 web3 corosync[2651]: [TOTEM ] Initializing transport (UDP/IP Unicast).
Oct 29 14:39:05 web3 corosync[2651]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Oct 29 14:39:05 web3 corosync[2651]: [TOTEM ] The network interface [10.32.6.194] is now up.
Oct 29 14:39:05 web3 corosync[2651]: [QUORUM] Using quorum provider quorum_cman
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Oct 29 14:39:05 web3 corosync[2651]: [CMAN ] CMAN 3.0.12.1 (built Sep 25 2014 15:07:47) started
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: openais checkpoint service B.01.01
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: corosync configuration service
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: corosync profile loading service
Oct 29 14:39:05 web3 corosync[2651]: [QUORUM] Using quorum provider quorum_cman
Oct 29 14:39:05 web3 corosync[2651]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Oct 29 14:39:05 web3 corosync[2651]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Oct 29 14:39:05 web3 corosync[2651]: [TOTEM ] adding new UDPU member {10.32.6.153}
Oct 29 14:39:05 web3 corosync[2651]: [TOTEM ] adding new UDPU member {10.32.6.194}
Oct 29 14:39:05 web3 corosync[2651]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 29 14:39:05 web3 corosync[2651]: [CMAN ] quorum regained, resuming activity
Oct 29 14:39:05 web3 corosync[2651]: [QUORUM] This node is within the primary component and will provide service.
Oct 29 14:39:05 web3 corosync[2651]: [QUORUM] Members[1]: 2
Oct 29 14:39:05 web3 corosync[2651]: [QUORUM] Members[1]: 2
Oct 29 14:39:05 web3 corosync[2651]: [CPG ] chosen downlist: sender r(0) ip(10.32.6.194) ; members(old:0 left:0)
Oct 29 14:39:05 web3 corosync[2651]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 29 14:39:09 web3 fenced[2708]: fenced 3.0.12.1 started
Oct 29 14:39:09 web3 dlm_controld[2734]: dlm_controld 3.0.12.1 started
Oct 29 14:39:09 web3 gfs_controld[2781]: gfs_controld 3.0.12.1 started
Oct 29 14:40:24 web3 fenced[2708]: fencing node web2.cluster
Oct 29 14:40:29 web3 fenced[2708]: fence web2.cluster success

I did not configure corosync.conf.

cluster.conf:

<?xml version="1.0"?>
<cluster config_version="8" name="web-cluster">
  <clusternodes>
    <clusternode name="web2.cluster" nodeid="1">
      <fence>
        <method name="fence-web2">
          <device name="fence-rhevm" port="web2.cluster"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="web3.cluster" nodeid="2">
      <fence>
        <method name="fence-web3">
          <device name="fence-rhevm" port="web3.cluster"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" transport="udpu" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_rhevm" ipaddr="192.168.1.1" login="admin@internal" name="fence-rhevm" passwd="secret" ssl="on"/>
  </fencedevices>
  <fence_daemon post_join_delay="30"/>
</cluster>
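As a sanity check on the agent itself, the fence device can be exercised by hand with the same credentials as the <fencedevice> entry above. A minimal sketch, assuming the standard fence_rhevm command-line options from the fence-agents package (-z corresponds to ssl="on"):

    # Query web2's power state through the RHEV-M manager at 192.168.1.1.
    # A clean "Status: ON" here rules out credential or SSL problems.
    fence_rhevm -a 192.168.1.1 -l admin@internal -p secret -z \
        -o status -n web2.cluster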
Thanks

On Wed, Oct 29, 2014 at 9:33 PM, Digimer <lists@alteeve.ca> wrote:

> On 29/10/14 09:33 AM, aditya hilman wrote:
>
>> Oct 29 13:15:30 web2 fenced[1548]: fenced 3.0.12.1 started
>> Oct 29 13:15:30 web2 dlm_controld[1568]: dlm_controld 3.0.12.1 started
>> Oct 29 13:15:30 web2 gfs_controld[1621]: gfs_controld 3.0.12.1 started
>> Oct 29 13:16:21 web2 fenced[1548]: fencing node web3.cluster
>> Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster dev 0.0 agent fence_rhevm result: error from agent
>> Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed
>> Oct 29 13:16:27 web2 fenced[1548]: fencing node web3.cluster
>> Oct 29 13:16:29 web2 fenced[1548]: fence web3.cluster success
>
> It didn't see the other node on boot, gave up, and fenced the peer, it seems. The fence call failed before it succeeded, another sign of a general network issue.
>
> As an aside, did you configure corosync.conf? If so, don't. Let cman handle everything.
>
> Are you starting cman on both nodes at (close to) exactly the same time?
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without access to education?
> _______________________________________________
> CentOS mailing list
> CentOS@centos.org
> http://lists.centos.org/mailman/listinfo/centos

--
Regards,
Adit
http://adityahilman.com
http://id.linkedin.com/in/adityahilman
ym : science2rule
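For reference on the timing question in Digimer's reply: fenced waits post_join_delay seconds (30 in the cluster.conf above) after joining the fence domain for missing members before fencing them, so both cman starts need to land inside that window. A rough sketch of the two usual remedies, assuming the stock CentOS 6 init scripts:

    # Start cman on web2 and web3 within post_join_delay of each other:
    service cman start

    # Or widen the window in cluster.conf (bumping config_version), e.g.
    #   <fence_daemon post_join_delay="60"/>
    # and then push the updated config out to both nodes:
    cman_tool version -r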