[CentOS] GFS + Restarting iptables

Wed Feb 18 10:00:47 UTC 2009
Sven Kaptein | MARS websolutions <info at marswebsolutions.nl>

>>>> Dear List,
>>>>
>>>> I have one last little problem with setting up a cluster. My GFS
>>>> mount will hang as soon as I do an iptables restart on one of the
>>>> nodes.
>> 
>>> Undoubtedly someone else with more experience with GFS will give you an
>>> answer, but to me this makes me think ip_conntrack stuff gets cleared
>>> out and sessions have to reestablish themselves.
>>>
>>> Ray
>> 
>> Ray,
>> 
>> Thanks for your fast answer and for pointing me in the right direction. This
>> sounds like a plausible explanation, but I have no clue how to fix it. I have
>> already googled a lot for ip_conntrack + gfs, but don't see a possible
>> solution coming up.
>> 
>> Can someone/you please help me a little bit more with the issue?
>> 
>> Thanks a lot!
>> Sven
> Are your GFS mounts and your cluster on different sides of the firewall?
>
> Maybe you can do something simple like a tunnel between the clusters and the
> mounts. Should be easier and safer than punching holes in the firewall. Or put
> a separate subnet or vlan just for the GFS traffic.
Uhm... I have the cluster running on a different VLAN than my iSCSI traffic.
Is that a problem? I would rather not put my cluster communication on the
other VLAN, since that one is dedicated to iSCSI now.

I have now figured out that it isn't the iptables restart causing the
trouble. The issue is as follows:

- Calling group_tool dump
This causes groupd to run at 100% CPU. Running strace on the process shows
it calling poll() in an infinite loop:
poll([{fd=1, events=POLLIN}, {fd=2, events=POLLIN}, {fd=7, events=POLLIN},
{fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=10, events=POLLIN},
{fd=12, events=POLLIN}, {fd=14, events=POLLIN}, {fd=18, events=POLLIN},
{fd=17, events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN,
revents=POLLNVAL}, {fd=-1}], 13, -1) = 1

- Calling group_tool dump gfs
This causes gfs_controld to run at 100% CPU, with exactly the same
behaviour as groupd (strace):
poll([{fd=2, events=POLLIN}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN},
{fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=11, events=POLLIN},
{fd=12, events=POLLIN, revents=POLLNVAL}, {fd=12, events=POLLIN,
revents=POLLNVAL}], 8, -1) = 2
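
Both traces show poll() being called with a timeout of -1 and still returning
at once, with revents=POLLNVAL on one descriptor. POLLNVAL means the descriptor
number in the poll set is not open, and poll() reports it immediately
regardless of the timeout. A daemon that keeps such a stale descriptor in its
poll set would spin like this, which would be consistent with the 100% CPU,
though that is only a guess about what groupd and gfs_controld do internally.
A minimal sketch that reproduces just the poll() behaviour (the function name
poll_closed_fd is made up for illustration and is not part of any cluster
tool):

```python
import os
import select
import time


def poll_closed_fd(timeout_ms=5000):
    """Poll a descriptor number that has already been closed.

    Returns the event list from poll() and the time the call took.
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    os.close(write_fd)

    poller = select.poll()
    # Register the stale descriptor number, similar to the closed fd
    # that shows up with revents=POLLNVAL in the strace output.
    poller.register(read_fd, select.POLLIN)

    start = time.monotonic()
    events = poller.poll(timeout_ms)  # returns at once, timeout is ignored
    elapsed = time.monotonic() - start
    return events, elapsed


if __name__ == "__main__":
    events, elapsed = poll_closed_fd()
    for fd, revents in events:
        is_nval = bool(revents & select.POLLNVAL)
        print("fd=%d POLLNVAL=%s elapsed=%.3fs" % (fd, is_nval, elapsed))
```

If a loop calls poll() again without dropping the invalid descriptor from the
set, every iteration returns immediately, which is the busy loop that strace
shows.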

Sometimes my mount hangs, but sometimes it just continues normally.
I can imagine this has to do with the amount of data going to the GFS mount.

So I guess it isn't really an iptables problem, but I'm trying to debug that
a little bit more as well.

Any clues?

Thanks!!
Sven