[CentOS-virt] GFS2 hangs after one node going down

Thu Mar 21 17:14:57 UTC 2013
Digimer <lists at alteeve.ca>

On 03/21/2013 01:11 PM, Maurizio Giungato wrote:
> Hi guys,
>
> my goal is to create a reliable virtualization environment using CentOS
> 6.4 and KVM. I have three nodes and a clustered GFS2 filesystem.
>
> The environment is up and working, but I'm worried about its reliability.
> If I turn the network interface down on one node to simulate a crash (for
> example on the node "node6.blade"):
>
> 1) GFS2 hangs (processes go into D state) until node6.blade gets fenced
> 2) not only does node6.blade get fenced, but also node5.blade!
>
> Help me to save my last neurons!
>
> Thanks
> Maurizio

DLM, the distributed lock manager provided by the cluster, is designed
to block when a known node goes into an unknown state. It does not
unblock until that node is confirmed to be fenced. This is by design.
GFS2, rgmanager and clustered LVM all use DLM, so they will all block
as well.
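
To make the block-until-fenced behaviour concrete, here is a toy model in
Python. It is only a sketch of the idea, not the real DLM API: the class
ToyLockManager and its methods are made up for illustration, and
"node4.blade" is a hypothetical name for the third node.

```python
import threading


class ToyLockManager:
    """Toy model of block-until-fenced behaviour (NOT the real DLM API)."""

    def __init__(self, members):
        self._cond = threading.Condition()
        self._members = set(members)
        self._unknown = set()  # nodes whose state is currently unknown

    def node_lost(self, node):
        # A known node stopped responding: its state is now unknown.
        with self._cond:
            self._unknown.add(node)

    def fence_confirmed(self, node):
        # Fencing succeeded: the node is known to be dead, so it is safe
        # to drop it from the membership and wake up blocked requests.
        with self._cond:
            self._unknown.discard(node)
            self._members.discard(node)
            self._cond.notify_all()

    def acquire(self, timeout=None):
        # Block while any node is in an unknown state.
        # Returns True when the request may proceed, False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: not self._unknown, timeout)


# "node4.blade" is a hypothetical name; only node5 and node6 were mentioned.
mgr = ToyLockManager(["node4.blade", "node5.blade", "node6.blade"])
mgr.node_lost("node6.blade")
blocked = not mgr.acquire(timeout=0.1)   # request cannot proceed yet
mgr.fence_confirmed("node6.blade")
granted = mgr.acquire(timeout=0.1)       # proceeds once fencing is confirmed
print(blocked, granted)
```

With timeout=None, acquire() waits indefinitely, which is the analogue of
the processes sitting in D state until the fence completes.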

As for why two nodes get fenced, you will need to share more about your 
configuration.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?