Dear List,
I have one last little problem with setting up a cluster: my GFS
mount hangs some time after I do an iptables restart on one of the
nodes.
First, let me describe my setup:
- 4 nodes, all running an updated CentOS 5.2 installation
- 1 Dell MD3000i iSCSI SAN
- All nodes are connected to the SAN through Dell's supplied RDAC driver
Everything runs stably once the cluster is started (tested for about a
week). When I made some final changes to my firewall and did an
iptables restart, it all went down after a while. I have now
reproduced the issue several times, so I am quite sure it is related
to the iptables restart.
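Since the trigger is a firewall reload, one thing worth ruling out is
whether the rule set being loaded lets the cluster traffic through at
all. Below is only a rough Python sketch of such a check. The port
table is an assumption based on the ports Red Hat documents for the
RHEL 5 cluster stack (cman/openais, dlm, ccsd) and should be verified
against the installed release; the function names and the sample rules
are hypothetical. It only understands simple numeric --dport/--dports
ACCEPT rules in iptables-save format, and it says nothing about
connection state being lost during the restart, which may be a
separate factor.

```python
import re

# ASSUMPTION: ports as documented for the RHEL 5 cluster stack.
# Verify against your own release before relying on this table.
CLUSTER_PORTS = {
    ("udp", 5404): "cman/openais",
    ("udp", 5405): "cman/openais",
    ("tcp", 21064): "dlm",
    ("tcp", 50006): "ccsd",
    ("udp", 50007): "ccsd",
    ("tcp", 50008): "ccsd",
    ("tcp", 50009): "ccsd",
}

PORT_SPEC = re.compile(r"^(\d+)(?::(\d+))?$")


def accepted_ports(rules_text):
    """Collect (protocol, port) pairs opened by simple ACCEPT rules."""
    opened = set()
    for line in rules_text.splitlines():
        if "-j ACCEPT" not in line:
            continue
        proto = re.search(r"-p (tcp|udp)\b", line)
        spec = re.search(r"--dports? (\S+)", line)
        if not proto or not spec:
            continue
        for part in spec.group(1).split(","):
            match = PORT_SPEC.match(part)
            if match is None:
                continue  # service names and open-ended ranges are skipped
            low = int(match.group(1))
            high = int(match.group(2) or low)
            for port in range(low, high + 1):
                opened.add((proto.group(1), port))
    return opened


def missing_cluster_ports(rules_text):
    """Return the assumed cluster ports that no ACCEPT rule covers."""
    opened = accepted_ports(rules_text)
    return sorted(key for key in CLUSTER_PORTS if key not in opened)


# Hypothetical rules, in the format of /etc/sysconfig/iptables.
SAMPLE_RULES = """\
-A INPUT -p udp -m udp --dport 5404:5405 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 21064 -j ACCEPT
-A INPUT -p tcp -m multiport --dports 50006,50008,50009 -j ACCEPT
"""

if __name__ == "__main__":
    for proto, port in missing_cluster_ports(SAMPLE_RULES):
        print("not opened:", proto, port, CLUSTER_PORTS[(proto, port)])
```

An empty result only means the listed ports are accepted somewhere in
the file; rule order and source restrictions are not evaluated.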
I have a custom fence script for our ipoman powerswitch, which is
fully tested and working fine.
When I do an iptables restart, the following happens:
- After approx. 10 seconds the gfs_controld process goes to 100% CPU
usage (on all nodes!)
- I can still access my GFS mount
- group_tool dump gfs tells me:
----------------------
1234541723 config_no_withdraw 0
1234541723 config_no_plock 0
1234541723 config_plock_rate_limit 100
1234541723 config_plock_ownership 0
1234541723 config_drop_resources_time 10000
1234541723 config_drop_resources_count 10
1234541723 config_drop_resources_age 10000
1234541723 protocol 1.0.0
1234541723 listen 1
1234541723 cpg 5
1234541723 groupd 6
1234541723 uevent 7
1234541723 plocks 10
1234541723 plock cpg message size: 336 bytes
1234541723 setup done
1234541737 client 6: join /setan gfs lock_dlm mars:setan rw /dev/mapper/vg_cluster-lv_cluster
1234541737 mount: /setan gfs lock_dlm mars:setan rw /dev/mapper/vg_cluster-lv_cluster
1234541737 setan cluster name matches: mars
1234541737 setan do_mount: rv 0
1234541737 groupd cb: set_id setan 20004
1234541737 groupd cb: start setan type 2 count 4 members 3 2 1 4
1234541737 setan start 3 init 1 type 2 member_count 4
1234541737 setan add member 3
1234541737 setan add member 2
1234541737 setan add member 1
1234541737 setan add member 4
1234541737 setan total members 4 master_nodeid -1 prev -1
1234541737 setan start_participant_init
1234541737 setan send_options len 1296 "rw"
1234541737 setan start_done 3
1234541737 setan receive_options from 3 len 1296 last_cb 2
1234541737 setan receive_journals from 1 to 3 len 320 count 4 cb 2
1234541737 receive nodeid 1 jid 1 opts 1
1234541737 receive nodeid 2 jid 2 opts 1
1234541737 receive nodeid 3 jid 3 opts 1
1234541737 receive nodeid 4 jid 0 opts 1
1234541737 setan received_our_jid 3
1234541737 setan retrieve_plocks
1234541737 notify_mount_client: nodir not found for lockspace setan
1234541737 notify_mount_client: ccs_disconnect
1234541737 notify_mount_client: hostdata=jid=3:id=131076:first=0
1234541737 groupd cb: finish setan
1234541737 setan finish 3 needs_recovery 0
1234541737 setan set /sys/fs/gfs/mars:setan/lock_module/block to 0
1234541737 setan set open /sys/fs/gfs/mars:setan/lock_module/block error -1 2
1234541737 kernel: add@ mars:setan
1234541737 setan ping_kernel_mount 0
1234541738 kernel: change@ mars:setan
1234541738 setan recovery_done jid 3 ignored, first 0,0
1234541738 client 6: mount_result /setan gfs 0
1234541738 setan got_mount_result: ci 6 result 0 another 0 first_mounter 0 opts 1
1234541738 setan send_mount_status kernel_mount_error 0 first_mounter 0
1234541738 client 6 fd 11 dead
1234541738 setan receive_mount_status from 3 len 288 last_cb 3
1234541738 setan _receive_mount_status from 3 kernel_mount_error 0 first_mounter 0 opts 1
1234541925 client 6: dump
1234542420 client 6: dump
1234542420 client 7 fd 11 read error -1 9
1234542424 client 6: dump
1234542424 client 7 fd 11 read error -1 9
1234542424 client 8 fd 11 read error -1 9
1234542425 client 6: dump
1234542425 client 7 fd 11 read error -1 9
1234542425 client 8 fd 11 read error -1 9
1234542425 client 9 fd 11 read error -1 9
1234542426 client 6: dump
1234542426 client 7 fd 11 read error -1 9
1234542426 client 8 fd 11 read error -1 9
1234542426 client 9 fd 11 read error -1 9
1234542426 client 10 fd 11 read error -1 9
1234542427 client 6: dump
1234542427 client 7 fd 11 read error -1 9
1234542427 client 8 fd 11 read error -1 9
1234542427 client 9 fd 11 read error -1 9
1234542427 client 10 fd 11 read error -1 9
1234542427 client 11 fd 11 read error -1 9
1234542428 client 6: dump
----------------------
- After a while the groupd process hits 100% CPU as well (on all nodes)
- After a while the GFS mount becomes inaccessible; any attempt to
open it hangs.
- group_tool still shows all nodes participating in the cluster and
the gfs service, and reports no problems.
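To make the pattern in the dump easier to see, here is a minimal
Python sketch that counts the "read error" lines per timestamp and
converts the first one to a readable time. The function name and the
sample text are made up for illustration, and the line format is
inferred purely from the dump quoted above. The trailing "-1 9" looks
like a read() return value followed by an errno (9 is EBADF on Linux),
but that is a reading of the log, not something confirmed here.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Matches lines such as "1234542420 client 7 fd 11 read error -1 9".
READ_ERROR = re.compile(r"^(\d+) client (\d+) fd (\d+) read error (-?\d+) (\d+)$")


def summarize_read_errors(dump_text):
    """Return (errors per timestamp, client ids seen, first error time in UTC)."""
    per_second = Counter()
    clients = set()
    for line in dump_text.splitlines():
        match = READ_ERROR.match(line.strip())
        if match is None:
            continue
        stamp, client = int(match.group(1)), int(match.group(2))
        per_second[stamp] += 1
        clients.add(client)
    first_utc = None
    if per_second:
        first_utc = datetime.fromtimestamp(min(per_second), tz=timezone.utc)
    return per_second, clients, first_utc


# A few lines copied from the dump above.
SAMPLE = """\
1234542420 client 6: dump
1234542420 client 7 fd 11 read error -1 9
1234542424 client 6: dump
1234542424 client 7 fd 11 read error -1 9
1234542424 client 8 fd 11 read error -1 9
"""

if __name__ == "__main__":
    per_second, clients, first_utc = summarize_read_errors(SAMPLE)
    for stamp in sorted(per_second):
        print(stamp, per_second[stamp])
    print("clients:", sorted(clients))
    print("first error (UTC):", first_utc.isoformat())
```

Run against the full dump, this shows one more failing client id each
time the dump is requested, all on the same fd 11.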
Does anyone have a clue how to fix this completely, or at least how to
recover my system when it happens without a full reboot of the
complete cluster? I have spent many hours on this and I'm still very
new to clustering; I'm just testing it before I use it in production
environments.
I really appreciate any help!
Regards,
Sven
Config files/settings:
---------------------------
[root@badjak ~]# uname -a
Linux badjak.somedomain.tld 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 11:57:43 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
----------------------------
/etc/cluster/cluster.conf:
------------------------------------
<?xml version="1.0"?>
<cluster alias="mars" config_version="77" name="mars">
<fence_daemon clean_start="0" post_fail_delay="0"
post_join_delay="3"/>
<clusternodes>
<clusternode name="gandaria.somedomain.tld"
nodeid="1" votes="1">
<fence>
<method name="1">
<device name="ipoman"
action="Off" switch="volt" port="3"/>
<device name="ipoman"
action="Off" switch="ampere" port="9"/>
<device name="ipoman"
action="On" switch="volt" port="3"/>
<device name="ipoman"
action="On" switch="ampere" port="9"/>
</method>
</fence>
</clusternode>
<clusternode name="goreng.somedomain.tld" nodeid="2"
votes="1">
<fence>
<method name="1">
<device name="ipoman"
action="Off" switch="volt" port="4"/>
<device name="ipoman"
action="Off" switch="ampere" port="10"/>
<device name="ipoman"
action="On" switch="volt" port="4"/>
<device name="ipoman"
action="On" switch="ampere" port="10"/>
</method>
</fence>
</clusternode>
<clusternode name="brandal.somedomain.tld" nodeid="4"
votes="1">
<fence>
<method name="1">
<device name="ipoman"
action="Off" switch="volt" port="9"/>
<device name="ipoman"
action="Off" switch="ampere" port="3"/>
<device name="ipoman"
action="On" switch="volt" port="9"/>
<device name="ipoman"
action="On" switch="ampere" port="3"/>
</method>
</fence>
</clusternode>
<clusternode name="badjak.somedomain.tld" nodeid="3"
votes="1">
<fence>
<method name="1">
<device name="ipoman"
action="Off" switch="volt" port="10"/>
<device name="ipoman"
action="Off" switch="ampere" port="4"/>
<device name="ipoman"
action="On" switch="volt" port="10"/>
<device name="ipoman"
action="On" switch="ampere" port="4"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman/>
<fencedevices>
<fencedevice agent="fence_ipoman" name="ipoman"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>
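The fence method above lists two "Off" steps followed by two "On"
steps per node, one pair for each power switch. As a sanity check on
that ordering, the following Python sketch extracts the fence steps
for every node and verifies that no "Off" appears after the first
"On". The helper names and the shortened sample are hypothetical; the
attribute names (action, switch, port) are simply those used in the
config above, and how fence_ipoman interprets them is not checked.

```python
import xml.etree.ElementTree as ET


def fence_plan(cluster_conf_xml):
    """Map each clusternode name to its ordered (action, switch, port) steps."""
    root = ET.fromstring(cluster_conf_xml)
    plan = {}
    for node in root.iter("clusternode"):
        plan[node.get("name")] = [
            (dev.get("action"), dev.get("switch"), dev.get("port"))
            for dev in node.iter("device")
        ]
    return plan


def off_before_on(steps):
    """True if no 'Off' step follows the first 'On' step."""
    actions = [(action or "").lower() for action, _, _ in steps]
    if "on" not in actions:
        return True
    return "off" not in actions[actions.index("on"):]


# Shortened, hypothetical sample in the same shape as the config above.
SAMPLE_CONF = """\
<cluster alias="mars" config_version="77" name="mars">
  <clusternodes>
    <clusternode name="gandaria.somedomain.tld" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipoman" action="Off" switch="volt" port="3"/>
          <device name="ipoman" action="Off" switch="ampere" port="9"/>
          <device name="ipoman" action="On" switch="volt" port="3"/>
          <device name="ipoman" action="On" switch="ampere" port="9"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
</cluster>
"""

if __name__ == "__main__":
    for name, steps in fence_plan(SAMPLE_CONF).items():
        print(name, "ok" if off_before_on(steps) else "Off after On!")
```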
------------------------------------
[root@badjak /]# cat /etc/fstab
/dev/VolGroup00/LogVol00 / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
/dev/VolGroup00/LogVol01 swap swap defaults 0 0
/dev/vg_cluster/lv_cluster /setan gfs defaults 0 0
------------------------------------
[root@badjak ~]# group_tool
type             level name       id       state
fence            0     default    00010001 none
[1 2 3 4]
dlm              1     clvmd      00010004 none
[1 2 3 4]
dlm              1     setan      00030004 none
[1 2 3 4]
dlm              1     rgmanager  00040004 none
[1 2 3 4]
gfs              2     setan      00020004 none
[1 2 3 4]
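To compare this output across nodes without reading it by eye, a small
Python sketch like the one below can parse it and flag any group whose
state is not "none" or whose member list differs from the first group.
The parser assumes the five-column layout shown above with the member
list on the following line; the function names are hypothetical, and
treating "none" as the quiet state is an assumption drawn from this
output rather than from the group_tool documentation.

```python
def parse_group_tool(text):
    """Parse group_tool output into a list of dicts, one per group."""
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("type "):
            continue  # skip blanks and the header row
        if line.startswith("[") and line.endswith("]"):
            if groups:
                groups[-1]["members"] = [int(n) for n in line[1:-1].split()]
            continue
        fields = line.split()
        if len(fields) == 5:
            gtype, level, name, gid, state = fields
            groups.append(
                {"type": gtype, "level": int(level), "name": name,
                 "id": gid, "state": state, "members": []}
            )
    return groups


def suspicious_groups(groups):
    """Return 'type:name' for groups not in state 'none' or with odd membership."""
    if not groups:
        return []
    reference = groups[0]["members"]
    return [
        f'{g["type"]}:{g["name"]}'
        for g in groups
        if g["state"] != "none" or g["members"] != reference
    ]


# Output copied from the listing above.
SAMPLE_GROUPS = """\
type level name id state
fence 0 default 00010001 none
[1 2 3 4]
dlm 1 clvmd 00010004 none
[1 2 3 4]
dlm 1 setan 00030004 none
[1 2 3 4]
dlm 1 rgmanager 00040004 none
[1 2 3 4]
gfs 2 setan 00020004 none
[1 2 3 4]
"""

if __name__ == "__main__":
    groups = parse_group_tool(SAMPLE_GROUPS)
    print(len(groups), "groups, suspicious:", suspicious_groups(groups))
```

For the listing above nothing is flagged, which matches the
observation that group_tool reports no problems even while the mount
hangs.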
------------------------------------
[root@badjak ~]# clustat
Cluster Status for mars @ Fri Feb 13 17:18:58 2009
Member Status: Quorate
 Member Name                      ID   Status
 ------ ----                      ---- ------
 gandaria.somedomain.tld             1 Online
 goreng.somedomain.tld               2 Online
 badjak.somedomain.tld               3 Online, Local
 brandal.somedomain.tld              4 Online
------------------------------------
[root@badjak ~]# SMdevices
PowerVault Modular Disk Storage Manager Devices, Version 09.17.A6.01
Built Tue Mar 20 15:31:11 CST 2007
Copyright 2005-2006 Dell Inc. All rights reserved. Use is subject to
license terms
/dev/sdb (/dev/sg3) [Storage Array setan, Virtual Disk 1, LUN 0,
Virtual Disk ID <6001ec9000f2dc860000043448bf7e20>, Preferred Path
(Controller-1): In Use]
------------------------------------
[root@badjak ~]# /etc/init.d/clvmd status
clvmd (pid 7433) is running...
active volumes: LogVol00 LogVol01 lv_cluster
------------------------------------