[CentOS] Corosync init-script broken on CentOS6

Wed Nov 23 19:24:05 UTC 2011
Michel van Deventer <michel at van.deventer.cx>

Hi,

Did you configure corosync?
Normally corosync starts pacemaker, which in turn starts the heartbeat
daemons.
But you have to configure the latter, for example with a pcmk file
in /etc/corosync/service.d/ (off the top of my head).
I normally follow:
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/
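
As a rough sketch from memory, the plugin file described in that guide
looks something like the fragment below. The file name pcmk and the
directory /etc/corosync/service.d/ are what I have seen on my own
installs, so treat both as assumptions and check them against the guide
for your package versions.

```
# /etc/corosync/service.d/pcmk  (path and file name assumed, see above)
service {
        # Load the Pacemaker cluster resource manager plugin
        name: pacemaker
        # ver: 0 asks corosync itself to start the pacemaker daemons
        ver: 0
}
```

Corosync has to be restarted after adding or changing this file.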

	Regards,

        Michel

On Wed, 2011-11-23 at 12:16 -0500, Hal Martin wrote:
> Hello all,
> 
> I am trying to create a corosync/pacemaker cluster using CentOS 6.0.
> However, I'm having a great deal of difficulty doing so.
> 
> Corosync has a valid configuration file and an authkey has been generated.
> 
> When I run /etc/init.d/corosync I see that only corosync is started.
> From experience working with corosync/pacemaker before, I know that
> this is not enough to have a functioning cluster. For some reason the
> base install (with or without updates) is not starting corosync
> dependencies.
> 
> I've even tried using corosync/pacemaker from the EPEL 6 repo, and
> still the init-script will not start corosync dependencies.
> 
> Expected:
> corosync
> /usr/lib64/heartbeat/stonithd
> /usr/lib64/heartbeat/cib
> /usr/lib64/heartbeat/lrmd
> /usr/lib64/heartbeat/attrd
> /usr/lib64/heartbeat/pengine
> 
> Observed:
> corosync
> 
> My install options are:
> %packages
> @base
> @core
> @ha
> @nfs-file-server
> @network-file-system-client
> @resilient-storage
> @server-platform
> @server-policy
> @storage-client-multipath
> @system-admin-tools
> pax
> oddjob
> sgpio
> pacemaker
> dlm-pcmk
> screen
> lsscsi
> -rgmanager
> %end
> 
> The logs from the server aren't terribly helpful either:
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [pcmk  ] info: spawn_child:
> Forked child 2515 for process stonith-ng
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [pcmk  ] info: spawn_child:
> Forked child 2516 for process cib
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [pcmk  ] info: spawn_child:
> Forked child 2517 for process lrmd
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [pcmk  ] info: spawn_child:
> Forked child 2518 for process attrd
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [pcmk  ] info: spawn_child:
> Forked child 2519 for process pengine
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [pcmk  ] info: spawn_child:
> Forked child 2520 for process crmd
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [SERV  ] Service engine
> loaded: Pacemaker Cluster Manager 1.1.2
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [SERV  ] Service engine
> loaded: corosync extended virtual synchrony service
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [SERV  ] Service engine
> loaded: corosync configuration service
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [SERV  ] Service engine
> loaded: corosync cluster closed process group service v1.01
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [SERV  ] Service engine
> loaded: corosync cluster config database access v1.01
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [SERV  ] Service engine
> loaded: corosync profile loading service
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [SERV  ] Service engine
> loaded: corosync cluster quorum service v0.1
> Nov 23 12:13:45 cheapo4 corosync[2509]:   [MAIN  ] Compatibility mode
> set to whitetank.  Using V1 and V2 of the synchronization engine.
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] ERROR:
> pcmk_wait_dispatch: Child process lrmd exited (pid=2517, rc=100)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] notice:
> pcmk_wait_dispatch: Child process lrmd no longer wishes to be
> respawned
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] info:
> update_member: Node cheapo4.jrz.cbn now has process list:
> 00000000000000000000000000111302 (1118978)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] ERROR:
> pcmk_wait_dispatch: Child process cib exited (pid=2516, rc=100)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] notice:
> pcmk_wait_dispatch: Child process cib no longer wishes to be respawned
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] info:
> update_member: Node cheapo4.jrz.cbn now has process list:
> 00000000000000000000000000111202 (1118722)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] ERROR:
> pcmk_wait_dispatch: Child process crmd exited (pid=2520, rc=100)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] notice:
> pcmk_wait_dispatch: Child process crmd no longer wishes to be
> respawned
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] info:
> update_member: Node cheapo4.jrz.cbn now has process list:
> 00000000000000000000000000111002 (1118210)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] ERROR:
> pcmk_wait_dispatch: Child process attrd exited (pid=2518, rc=100)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] notice:
> pcmk_wait_dispatch: Child process attrd no longer wishes to be
> respawned
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] info:
> update_member: Node cheapo4.jrz.cbn now has process list:
> 00000000000000000000000000110002 (1114114)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] ERROR:
> pcmk_wait_dispatch: Child process pengine exited (pid=2519, rc=100)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] notice:
> pcmk_wait_dispatch: Child process pengine no longer wishes to be
> respawned
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] info:
> update_member: Node cheapo4.jrz.cbn now has process list:
> 00000000000000000000000000100002 (1048578)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] ERROR:
> pcmk_wait_dispatch: Child process stonith-ng exited (pid=2515, rc=100)
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] notice:
> pcmk_wait_dispatch: Child process stonith-ng no longer wishes to be
> respawned
> Nov 23 12:13:46 cheapo4 corosync[2509]:   [pcmk  ] info:
> update_member: Node cheapo4.jrz.cbn now has process list:
> 00000000000000000000000000000002 (2)
> 
> Googling suggests this is the result of mismatched versions of
> pacemaker/corosync, but this is the base install, and the problem
> persists even when I install them from the EPEL 6 repos.
> 
> Is anyone else experiencing difficulties with corosync/pacemaker on CentOS 6?
> 
> Thanks,
> -Hal