Hi all,
I'm trying to set up a cluster of 2 machines with CentOS 5.2 to host a postfix+spamassassin+clamav+mailscanner service. The cluster software versions are:
rgmanager.i386 2.0.38-2.el5_2.1 installed
cman.i386 2.0.84-2.el5_2.2 installed
Every machine (HP blade server) has 4 interfaces, bonded in this way:
eth0, eth1 -> bond0 -> connection for public service (10.0.181.x)
eth2, eth3 -> bond1 -> connection for intra-cluster communication (192.168.44.x)
bond0     Link encap:Ethernet  HWaddr 00:21:5A:48:DA:BE
          inet addr:10.0.181.41  Bcast:10.0.181.255  Mask:255.255.255.0
          inet6 addr: fe80::221:5aff:fe48:dabe/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:85 errors:0 dropped:0 overruns:0 frame:0
          TX packets:86 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:12963 (12.6 KiB)  TX bytes:9144 (8.9 KiB)
bond1     Link encap:Ethernet  HWaddr 00:1F:29:6D:7D:08
          inet addr:192.168.44.41  Bcast:192.168.44.255  Mask:255.255.255.0
          inet6 addr: fe80::21f:29ff:fe6d:7d08/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:29 errors:0 dropped:0 overruns:0 frame:0
          TX packets:223 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:4612 (4.5 KiB)  TX bytes:31746 (31.0 KiB)
Then I've created a new Mail service with these local resources:
- IP address 10.0.181.3
- Script /etc/rc.d/init.d/MailScanner
- GFS file system on a SAN
The service starts, but the problem is that when I stop the service, the external IP address is removed from bond0.
What may cause this? I can't find any helpful information in /var/log/messages; what else can I check to investigate?
I have not set up any fence device, as I have configured the GFS resource inside the service (the GFS partition is mounted only on the server running the service): is this a wrong design? Has anyone already set up this kind of service on a CentOS cluster?
Many thanks in advance for any hints.
Fabio
Hi Fabio,
Could you please attach the following files:
/etc/sysconfig/network-scripts/ifcfg-bond0
/etc/sysconfig/network-scripts/ifcfg-bond1
/etc/sysconfig/network-scripts/ifcfg-eth0
/etc/sysconfig/network-scripts/ifcfg-eth1
/etc/sysconfig/network-scripts/ifcfg-eth2
/etc/sysconfig/network-scripts/ifcfg-eth3
/etc/cluster/cluster.conf
Regarding "external ip address is removed from bond0": I assume the external IP here is 10.0.181.41, right?
Thanks Gowrishankar Rajaiyan | A Linux Fanatic.
-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of A Linux Fanatic
Sent: Tuesday, 23 December 2008 06:08
To: CentOS mailing list
Subject: Re: [CentOS] cluster - ip address lost when service stopped
Hi Gowrishankar,
The requested files are attached. You understood correctly: I mean that IP 10.0.181.41 disappears (below is the output from ifconfig after I tried to stop the service):
bond0     Link encap:Ethernet  HWaddr 00:21:5A:48:DA:BE
          inet6 addr: fe80::221:5aff:fe48:dabe/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:52958 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7844 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:4915061 (4.6 MiB)  TX bytes:4936239 (4.7 MiB)
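One way to pin down exactly which addresses vanish is to snapshot the interface before and after stopping the service and diff the two. The following is a minimal sketch, assuming the snapshots are taken with `ip -o -4 addr show dev bond0` (as far as I know, rgmanager adds service addresses through the `ip` tool, so `ip addr` tends to give a more complete view than `ifconfig`). The sample text, the interface index and the function names are illustrative assumptions, not output captured from the machines in this thread.

```python
def parse_inet_addrs(ip_output):
    """Return the set of IPv4 'address/prefix' strings found in
    the text produced by `ip -o -4 addr show`."""
    addrs = set()
    for line in ip_output.splitlines():
        fields = line.split()
        if "inet" in fields:
            pos = fields.index("inet")
            if pos + 1 < len(fields):
                addrs.add(fields[pos + 1])
    return addrs


def unexpectedly_lost(before, after, service_addrs):
    """List addresses present before but not after the stop,
    ignoring those the cluster service is expected to remove."""
    lost = parse_inet_addrs(before) - parse_inet_addrs(after)
    return sorted(a for a in lost if a.split("/")[0] not in service_addrs)


# Hypothetical snapshots: node address plus one service address before
# the stop, nothing left afterwards (the symptom described above).
BEFORE = """\
5: bond0    inet 10.0.181.41/24 brd 10.0.181.255 scope global bond0
5: bond0    inet 10.0.181.4/24 scope global secondary bond0
"""
AFTER = ""

print(unexpectedly_lost(BEFORE, AFTER, {"10.0.181.4"}))
```

Anything this prints is an address that went away even though no cluster service owns it, which is the case worth reporting together with the matching lines from /var/log/messages.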
Tks
Fabio
Hi Fabio
First, issue the command:
$ rpm -qf /sbin/ifup
It will respond with a line of text starting with either "initscripts" or "sysconfig," followed by some numbers. This is the package that provides your network initialization scripts.
Next, to determine if your installation supports bonding, issue the command:
$ grep ifenslave /sbin/ifup
If this returns any matches, then your initscripts or sysconfig has support for bonding.
Ref: http://www.linuxfoundation.org/en/Net:Bonding
Try configuring ifcfg-bondX using the contents described in the above link.
Thanks Gowrishankar Rajaiyan | A Linux Fanatic.
Hi Gowrishankar,
This problem seems to be related to the cluster, not to bonding: bonding is working correctly. In any case, I ran a test with bonding removed, and I see the same problem directly on interface eth0.
This is my cluster.conf:
<?xml version="1.0" ?>
<cluster alias="cluster01" config_version="54" name="cluster01">
  <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="30"/>
  <clusternodes>
    <clusternode name="AREA041" nodeid="2" votes="1">
      <fence/>
    </clusternode>
    <clusternode name="AREA042" nodeid="3" votes="1">
      <fence/>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices/>
  <rm>
    <failoverdomains>
      <failoverdomain name="httpd failover domain" ordered="0" restricted="1">
        <failoverdomainnode name="AREA041" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="10.0.181.3" monitor_link="1"/>
    </resources>
    <service autostart="0" domain="httpd failover domain" name="Apache" recovery="disable">
      <script file="/etc/rc.d/init.d/httpd" name="script httpd"/>
      <ip ref="10.0.181.3"/>
    </service>
    <service autostart="0" domain="httpd failover domain" name="Service Mail" recovery="disable">
      <script file="/etc/rc.d/init.d/MailScanner" name="MailScanner"/>
      <clusterfs device="/dev/DATI_MAIL/DATI_MAIL" force_unmount="1" fsid="5845" fstype="gfs2" mountpoint="/dati_mail" name="Share_dati_mail" options=""/>
      <ip address="10.0.181.4" monitor_link="1"/>
    </service>
  </rm>
</cluster>
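To see at a glance which addresses the cluster is supposed to manage, the service definitions can be read programmatically. This is a rough sketch using only Python's standard library; the embedded XML is a trimmed copy of the `<rm>` section above, and `service_ips` is a hypothetical helper name, not part of any cluster tool. It treats `<ip ref="..."/>` as pointing at the shared resource with that address, which matches how the file above uses it.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the configuration above (services only).
CLUSTER_CONF = """
<cluster alias="cluster01" config_version="54" name="cluster01">
  <rm>
    <resources>
      <ip address="10.0.181.3" monitor_link="1"/>
    </resources>
    <service autostart="0" name="Apache" recovery="disable">
      <script file="/etc/rc.d/init.d/httpd" name="script httpd"/>
      <ip ref="10.0.181.3"/>
    </service>
    <service autostart="0" name="Service Mail" recovery="disable">
      <script file="/etc/rc.d/init.d/MailScanner" name="MailScanner"/>
      <ip address="10.0.181.4" monitor_link="1"/>
    </service>
  </rm>
</cluster>
"""


def service_ips(conf_xml):
    """Map each service name to the IP addresses declared inside it,
    whether given inline (address=) or by reference (ref=)."""
    root = ET.fromstring(conf_xml.strip())
    result = {}
    for svc in root.iter("service"):
        addrs = []
        for ip in svc.iter("ip"):
            addr = ip.get("address") or ip.get("ref")
            if addr:
                addrs.append(addr)
        result[svc.get("name")] = addrs
    return result


managed = service_ips(CLUSTER_CONF)
print(managed)

# The node's own address should not be owned by any service.
node_addr = "10.0.181.41"
print(any(node_addr in addrs for addrs in managed.values()))
```

With this configuration the node address 10.0.181.41 does not appear under any service, so nothing in cluster.conf itself asks for it to be removed when a service stops.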
Many thanks
Fabio
________________________________
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of A Linux Fanatic
Sent: Tuesday, 23 December 2008 11:43
To: CentOS mailing list
Subject: Re: [CentOS] cluster - ip address lost when service stopped
From what I can tell, the behaviour you are noticing is consistent with your cluster.conf file. Since you have made the IP addresses part of the service definitions, the IP would "go away" when the associated service is stopped. If the service moved to another node, however, the IP would be enabled on the host to which the service was moved.
If you want the IP addresses to be independent of the service state, then add them using files in /etc/sysconfig/network-scripts to define the "alias" addresses and remove them from your service definitions. See /usr/share/doc/initscripts-*/sysconfig.txt for details on how to set up the alias addresses.
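For illustration only, an alias definition of the kind described in sysconfig.txt could be generated as below. This is a sketch under stated assumptions: the alias name `bond0:0`, the choice of 10.0.181.3 and the helper name `render_alias_ifcfg` are all hypothetical, and the keys used (`DEVICE`, `IPADDR`, `NETMASK`, `ONPARENT`) should be checked against the sysconfig.txt shipped with the installed initscripts package. An address defined this way would also have to be removed from cluster.conf so that the two mechanisms do not fight over it.

```python
def render_alias_ifcfg(parent, index, address, netmask):
    """Return (file name, file content) for an interface alias file
    intended for /etc/sysconfig/network-scripts."""
    device = "%s:%d" % (parent, index)
    lines = [
        "DEVICE=" + device,    # alias name, e.g. bond0:0
        "IPADDR=" + address,   # the extra address to keep on the node
        "NETMASK=" + netmask,
        "ONPARENT=yes",        # bring the alias up together with its parent
    ]
    return "ifcfg-" + device, "\n".join(lines) + "\n"


# Hypothetical example values, not taken from a working configuration.
name, text = render_alias_ifcfg("bond0", 0, "10.0.181.3", "255.255.255.0")
print(name)
print(text, end="")
```

The function only builds the text; writing it to disk and restarting the network is left out on purpose, since that should be done by hand on a test node first.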
I'm doing something similar with IP addresses in a cluster, but I WANT the IP address to migrate to the target host when a service is moved from one node in the cluster to another. I have the IP address resources tied to the individual services to make that happen.
Hope that helps!
Jay,
thanks for your answer, but there was a small misunderstanding: the IP 10.0.181.4 is associated with the service, and it is correct that it goes away when the service is stopped (this is exactly what I want, just like your environment). My problem is that IP 10.0.181.41, the original IP assigned to interface bond0 of node AREA041, goes away too when the service is stopped, and I don't understand why. I suspect this is somehow related to fencing, as I don't use any fence device but assign the GFS resource inside the service (the GFS share is mounted only when the service starts and only on the node hosting it): do you think this is a correct design?
Tks Fabio
-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Jay Leafey
Sent: Monday, 29 December 2008 21:41
To: CentOS mailing list
Subject: Re: [CentOS] cluster - ip address lost when service stopped