[CentOS] fencing nodes with drac under 5.9

Wed Mar 6 00:29:07 UTC 2013
Adam Wead <amsterdamos at gmail.com>

Turns out I spoke too soon.

Increasing the post_join_delay did at least allow me to restart the
cman+clvmd+gfs2+rgmanager services on each node, but if I reboot a node, it
will not rejoin the cluster.
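
For anyone checking their own setup, here is a minimal sketch for reading
the delay back out of the config.  It assumes the stock location
/etc/cluster/cluster.conf and that post_join_delay is set as an attribute
on the fence_daemon element.  The function name is my own, not part of any
cluster tool.

```shell
#!/bin/sh
# Sketch only: print the post_join_delay value from a cluster.conf-style file.
# Prints nothing if the attribute is absent, in which case fenced falls back
# to its built-in default.
post_join_delay() {
    conf=${1:-/etc/cluster/cluster.conf}   # assumed default path
    sed -n 's/.*post_join_delay="\([0-9][0-9]*\)".*/\1/p' "$conf" | head -n 1
}
```

Calling post_join_delay with no argument reads the live config; pass a
path to check a copy instead.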

If I start with both machines up and cman stopped, I can start cman on
one, then the other, and they'll both join the cluster.  After that, I
start clvmd, gfs2 and rgmanager on one node, then the other (all in that
order), and the gfs2 partition is mounted on both nodes.
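
The ordering above can be sketched as a small script.  This is only an
illustration: the function names and the DRY_RUN switch are made up for
this example, and it assumes the services are driven through the usual
"service" command.  By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch only: service names and order are taken from the description above.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

CLUSTER_SERVICES="cman clvmd gfs2 rgmanager"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start the stack in order; give up at the first failure.
start_stack() {
    for svc in $CLUSTER_SERVICES; do
        run service "$svc" start || return 1
    done
}

# Stop the stack in the reverse order.
stop_stack() {
    reversed=""
    for svc in $CLUSTER_SERVICES; do
        reversed="$svc $reversed"
    done
    for svc in $reversed; do
        run service "$svc" stop || return 1
    done
}
```

With DRY_RUN=0, start_stack would be run on the first node and then on the
second.  Note that this starts the whole stack on one node at a time, which
is not quite the same as starting cman on both nodes first.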

Now, if I reboot one of the nodes, it leaves the cluster, but when it
comes back online it starts cman and hangs on fencing.  After a while,
"dlm closing connection to node 0" and "dlm closing connection to node 1"
appear on the console and the system finishes booting.  At that point, it
is not in the cluster.  I have to stop the cluster services on both nodes
(in reverse order: rgmanager, gfs2, clvmd, cman), then start cman again on
both nodes and proceed with the rest of the services.
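
In case it helps anyone reproduce this, a rough sketch for capturing the
membership and fence-domain state on the rebooted node while it is stuck.
It assumes the standard CentOS 5 cluster tools (cman_tool, group_tool,
clustat) are on the PATH; the wrapper function is just something I named
for this example, and any tool that is missing gets skipped.

```shell
#!/bin/sh
# Sketch only: dump membership and fence-domain state on one node.
# Commands that are not installed are reported and skipped.
cluster_state() {
    for cmd in "cman_tool status" "cman_tool nodes" "group_tool ls" "clustat"; do
        echo "== $cmd =="
        set -- $cmd                      # split into program and arguments
        if command -v "$1" >/dev/null 2>&1; then
            "$@" 2>&1 || echo "(exit status $?)"
        else
            echo "(skipped: $1 not found)"
        fi
    done
}
```

Running cluster_state on both nodes and comparing the output should show
whether the surviving node still lists the rebooted one as a member.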

I should add that ricci is running on both nodes, but I'm not using luci;
I configured the setup with system-config-cluster.

Anyway, I'd appreciate it if anyone could shed light on this.  I'm stumped
as to why this has changed in 5.9, but it could be just my ignorance of the
changes that were made with this latest release.

Many thanks,

...adam

On Tue, Mar 5, 2013 at 6:31 PM, Adam Wead <amsterdamos at gmail.com> wrote:

> Thanks for the response.  I just discovered the problem about 30 minutes
> ago.  post_join_delay was set to the default of 3, meaning that it was
> only waiting 3 seconds for the node to join before fencing it.  Silly.
> After changing that to 300 seconds, it worked fine.
>
> The config was this way with 5.8 and prior so why it wasn't an issue then,
> who can say.
>
> I also changed the fencing agent to fence_ipmilan, and configured the user
> on the DRAC card to be an "administrator" for IPMI.
>
> If it's any help to anyone, I've posted the working cluster.conf file.
> You can also test your fencing for each drac:
>
> fence_ipmilan -a [drac IP] -l [drac user] -p [password for drac user] -o status
>
> ...adam
>
> ____________________________________________
> Adam Wead
> Systems and Digital Collections Librarian
> Rock and Roll Hall of Fame and Museum
> 216.515.1960 (t)
> 215.515.1964 (f)
>
>
> On Tue, Mar 5, 2013 at 5:57 PM, Joseph L. Casale <
> jcasale at activenetwerx.com> wrote:
>
>> > I have two Dells which are both fenced via their DRAC6 cards.
>>
>>
>> Without your cluster config, we can only guess. Fencing w/ two nodes
>> requires specific startup config for this scenario. Given that, I presume
>> you can find your issue, or post your conf.
>>
>>
>> jlc
>> _______________________________________________
>> CentOS mailing list
>> CentOS at centos.org
>> http://lists.centos.org/mailman/listinfo/centos
>>
>
>