Hello Dominic,
Thanks for the response.
When I start cman manually, web3 is fenced by web2. Here are the logs:
web2 : /var/log/messages
Oct 29 13:15:25 web2 corosync[1493]: [MAIN ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Oct 29 13:15:25 web2 corosync[1493]: [MAIN ] Corosync built-in features: nss dbus rdma snmp
Oct 29 13:15:25 web2 corosync[1493]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
Oct 29 13:15:25 web2 corosync[1493]: [MAIN ] Successfully parsed cman config
Oct 29 13:15:25 web2 corosync[1493]: [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 29 13:15:25 web2 corosync[1493]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Oct 29 13:15:26 web2 corosync[1493]: [TOTEM ] The network interface [10.32.6.153] is now up.
Oct 29 13:15:26 web2 corosync[1493]: [QUORUM] Using quorum provider quorum_cman
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Oct 29 13:15:26 web2 corosync[1493]: [CMAN ] CMAN 3.0.12.1 (built Sep 25 2014 15:07:47) started
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: openais checkpoint service B.01.01
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: corosync configuration service
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: corosync profile loading service
Oct 29 13:15:26 web2 corosync[1493]: [QUORUM] Using quorum provider quorum_cman
Oct 29 13:15:26 web2 corosync[1493]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Oct 29 13:15:26 web2 corosync[1493]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Oct 29 13:15:26 web2 corosync[1493]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 29 13:15:26 web2 corosync[1493]: [CMAN ] quorum regained, resuming activity
Oct 29 13:15:26 web2 corosync[1493]: [QUORUM] This node is within the primary component and will provide service.
Oct 29 13:15:26 web2 corosync[1493]: [QUORUM] Members[1]: 1
Oct 29 13:15:26 web2 corosync[1493]: [QUORUM] Members[1]: 1
Oct 29 13:15:26 web2 corosync[1493]: [CPG ] chosen downlist: sender r(0) ip(10.32.6.153) ; members(old:0 left:0)
Oct 29 13:15:26 web2 corosync[1493]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 29 13:15:30 web2 fenced[1548]: fenced 3.0.12.1 started
Oct 29 13:15:30 web2 dlm_controld[1568]: dlm_controld 3.0.12.1 started
Oct 29 13:15:30 web2 gfs_controld[1621]: gfs_controld 3.0.12.1 started
Oct 29 13:16:21 web2 fenced[1548]: fencing node web3.cluster
Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster dev 0.0 agent fence_rhevm result: error from agent
Oct 29 13:16:24 web2 fenced[1548]: fence web3.cluster failed
Oct 29 13:16:27 web2 fenced[1548]: fencing node web3.cluster
Oct 29 13:16:29 web2 fenced[1548]: fence web3.cluster success
---
web3 : /var/log/messages
Oct 29 13:15:26 web3 corosync[1526]: [MAIN ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Oct 29 13:15:26 web3 corosync[1526]: [MAIN ] Corosync built-in features: nss dbus rdma snmp
Oct 29 13:15:26 web3 corosync[1526]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
Oct 29 13:15:26 web3 corosync[1526]: [MAIN ] Successfully parsed cman config
Oct 29 13:15:26 web3 corosync[1526]: [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 29 13:15:26 web3 corosync[1526]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Oct 29 13:15:26 web3 corosync[1526]: [TOTEM ] The network interface [10.32.6.194] is now up.
Oct 29 13:15:27 web3 corosync[1526]: [QUORUM] Using quorum provider quorum_cman
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Oct 29 13:15:27 web3 corosync[1526]: [CMAN ] CMAN 3.0.12.1 (built Sep 25 2014 15:07:47) started
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: openais checkpoint service B.01.01
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: corosync configuration service
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: corosync profile loading service
Oct 29 13:15:27 web3 corosync[1526]: [QUORUM] Using quorum provider quorum_cman
Oct 29 13:15:27 web3 corosync[1526]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Oct 29 13:15:27 web3 corosync[1526]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Oct 29 13:15:27 web3 corosync[1526]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 29 13:15:27 web3 corosync[1526]: [CMAN ] quorum regained, resuming activity
Oct 29 13:15:27 web3 corosync[1526]: [QUORUM] This node is within the primary component and will provide service.
Oct 29 13:15:27 web3 corosync[1526]: [QUORUM] Members[1]: 2
Oct 29 13:15:27 web3 corosync[1526]: [QUORUM] Members[1]: 2
Oct 29 13:15:27 web3 corosync[1526]: [CPG ] chosen downlist: sender r(0) ip(10.32.6.194) ; members(old:0 left:0)
Oct 29 13:15:27 web3 corosync[1526]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 29 13:15:30 web3 fenced[1582]: fenced 3.0.12.1 started
Oct 29 13:15:31 web3 dlm_controld[1608]: dlm_controld 3.0.12.1 started
Oct 29 13:15:31 web3 gfs_controld[1655]: gfs_controld 3.0.12.1 started
Oct 29 13:14:54 web3 kernel: : Events: unsupported p6 CPU model 44 no PMU driver, software events only.
Oct 29 13:14:54 web3 kernel: : NMI watchdog disabled (cpu0): hardware events not enabled
Oct 29 13:14:54 web3 kernel: : Brought up 1 CPUs
Thanks.
On Wed, Oct 29, 2014 at 5:14 PM, Dominic Geevarghese share2dom@gmail.com wrote:
Hi,
Does anybody know how to solve this "fence loop"?
master_wins="1" is not working properly, and neither is qdisk.
The logs shared are not sufficient to identify the cause of the fence loop. I would suggest that you:
1. Disable cman (chkconfig cman off), and rgmanager too if you wish, on both nodes.
2. Reboot both nodes.
3. Once the machines are up, open two terminals.
4. Start cman manually on both nodes.
5. Share the behaviour and the logs generated.
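The steps above can be sketched as a small shell helper, to be run on each node. This is only an illustration, assuming a CentOS 6 style system where `chkconfig` and `service` manage cman. The `run` wrapper, the `DRY_RUN` variable and the function names are hypothetical and not part of any cluster tool; by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the suggested debugging steps for one cluster node.
# DRY_RUN, run() and the function names below are illustrative assumptions.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: stop cman (and optionally rgmanager) from starting at boot.
disable_cluster_autostart() {
    run chkconfig cman off
    run chkconfig rgmanager off
}

# Step 4: after the reboot, start cman by hand so the startup can be watched.
start_cman_manually() {
    run service cman start
}
```

Before the reboot, call `disable_cluster_autostart` on both nodes; once they are back up, call `start_cman_manually` on each while following `/var/log/messages` (for example with `tail -f`) in the second terminal. Set `DRY_RUN=0` to actually execute the commands.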
Cheers,