[CentOS-virt] GFS2 hangs after one node going down

Thu Mar 21 17:48:12 UTC 2013
Maurizio Giungato <m.giungato at pixnamic.com>

On 21/03/2013 18:14, Digimer wrote:
> On 03/21/2013 01:11 PM, Maurizio Giungato wrote:
>> Hi guys,
>>
>> my goal is to create a reliable virtualization environment using CentOS
>> 6.4 and KVM. I have three nodes and a clustered GFS2 filesystem.
>>
>> The environment is up and working, but I'm worried about its reliability.
>> If I turn the network interface down on one node to simulate a crash (for
>> example on the node "node6.blade"):
>>
>> 1) GFS2 hangs (processes go into D state) until node6.blade gets fenced
>> 2) not only node6.blade gets fenced, but also node5.blade!
>>
>> Help me to save my last neurons!
>>
>> Thanks
>> Maurizio
>
> DLM, the distributed lock manager provided by the cluster, is designed
> to block when a known node goes into an unknown state. It does not unblock
> until that node is confirmed to be fenced. This is by design. GFS2,
> rgmanager and clustered LVM all use DLM, so they will all block as well.
>
> As for why two nodes get fenced, you will need to share more about 
> your configuration.
>
My configuration is very simple. I have attached the cluster.conf and hosts
files. This is the line I added to /etc/fstab:

/dev/mapper/KVM_IMAGES-VL_KVM_IMAGES /var/lib/libvirt/images gfs2 defaults,noatime,nodiratime 0 0

I also set fallback_to_local_locking = 0 in lvm.conf, but nothing changed.
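
To double-check the LVM side on each node, something like the following
sketch could be used. It assumes clvmd is in use, in which case lvm.conf
normally has locking_type = 3 (clustered locking) alongside
fallback_to_local_locking = 0. The function names and the SAMPLE text are
made up for illustration and are not taken from the real nodes; on a node,
the contents of /etc/lvm/lvm.conf would be passed in instead.

```python
import re


def read_lvm_settings(text, keys=("locking_type", "fallback_to_local_locking")):
    """Return {key: int} for simple `key = number` lines, ignoring comments."""
    found = {}
    for line in text.splitlines():
        # Drop anything after '#', then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        match = re.match(r"^(\w+)\s*=\s*(-?\d+)$", line)
        if match and match.group(1) in keys:
            found[match.group(1)] = int(match.group(2))
    return found


def clustered_locking_ok(settings):
    """True if clustered locking is on and the local fallback is off."""
    return (
        settings.get("locking_type") == 3
        and settings.get("fallback_to_local_locking") == 0
    )


# Illustrative lvm.conf excerpt (assumed values, not copied from the cluster).
SAMPLE = """
global {
    locking_type = 3              # clustered locking via clvmd
    fallback_to_local_locking = 0
}
"""

if __name__ == "__main__":
    settings = read_lvm_settings(SAMPLE)
    print(settings, clustered_locking_ok(settings))
```

This only confirms that the two settings are consistent on a node; it says
nothing about why a second node gets fenced.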

PS: I had two virtualization environments working like a charm on OCFS2,
but since CentOS 6.x I have not been able to install it. Is there some way
to achieve the same results with GFS2? With GFS2 I sometimes get a crash
after nothing more than a "service network restart" (I have many interfaces,
so this operation takes more than 10 seconds). With OCFS2 I never had this
problem.

Thanks

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf
Type: text/xml
Size: 879 bytes
Desc: not available
URL: <http://lists.centos.org/pipermail/centos-virt/attachments/20130321/5e46b901/attachment-0006.xml>
-------------- next part --------------
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

20.11.11.104    lama4.blade
20.11.11.105    lama5.blade
20.11.11.106    lama6.blade

20.11.12.104    lama4-fencing  lama4-fencing.blade
20.11.12.105    lama5-fencing  lama5-fencing.blade
20.11.12.106    lama6-fencing  lama6-fencing.blade