[CentOS] heartbeat configuration for lb

Tue Dec 14 18:51:19 UTC 2010
bluethundr <bluethundr at gmail.com>

Hey guys, thanks for the tip. I haven't had a chance to play with
heartbeat, as we decided to go with keepalived as per Emmett's
suggestion. It works beautifully with two keepalived/haproxy load
balancers. I really appreciate Emmett's advice, and sorry I didn't let
you know sooner that it was working.

 At any rate, I was told to add a 3rd load balancer to the mix and
that adds a new wrinkle. I need to add a new keepalived instance and I
can't quite figure out how that's done.

 It would seem to me to be an issue of the priorities as that is the
only thing I altered in the files.  Initially, nodes A and B were set
to 101 and 100 respectively. I set node A to 102, node B to 101 and
node C to 100... keepalived restarts and the virtual IP is pingable.
But the website goes down! :(
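
To make the priority change concrete, here is a rough sketch of the
per-node stanza. Only the priority values (102/101/100) come from what
I described above. The instance name VI_1, virtual_router_id 51,
interface eth0 and the VIP 192.168.1.200 are placeholder assumptions
for illustration, and render_vrrp_instance is just a made-up helper
name.

```shell
#!/bin/sh
# Hypothetical helper: print a keepalived vrrp_instance stanza for one node.
# Assumed placeholders (not taken from a real config): VI_1,
# virtual_router_id 51, eth0 and the VIP 192.168.1.200.
# Only the priority differs between the three nodes.
render_vrrp_instance() {
    priority="$1"
    cat <<EOF
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority ${priority}
    advert_int 1
    virtual_ipaddress {
        192.168.1.200
    }
}
EOF
}

# Node A, B and C would use 102, 101 and 100 respectively.
for p in 102 101 100; do
    render_vrrp_instance "$p"
done
```

All three nodes would need the same virtual_router_id and
virtual_ipaddress for this to behave as one VRRP group, with the
highest priority expected to end up holding the address.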

 So then I tried node A set to 101, node B to 100 and node C to 99.
Same thing: I restarted keepalived and the site goes down, though the
virtual IP remains pingable and keepalived and haproxy are running.
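
Since the virtual IP still answers pings while the site is down, one
thing worth checking is which node actually holds the address at that
moment. A minimal sketch, assuming the VIP is 192.168.1.200 on eth0
(both placeholders) and using a made-up helper name, holds_vip, that
reads `ip addr show` output on stdin:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given address appears as an "inet"
# entry in `ip addr show` output read from stdin.
holds_vip() {
    vip="$1"
    # Escape the dots so they match literally, then look for "inet <vip>/".
    pattern="$(printf '%s' "$vip" | sed 's/\./\\./g')"
    grep -Eq "^[[:space:]]*inet ${pattern}/"
}

# Run on each load balancer; the VIP and interface are assumed values.
if ip addr show eth0 2>/dev/null | holds_vip 192.168.1.200; then
    echo "this node holds the VIP"
else
    echo "this node does not hold the VIP"
fi
```

If more than one node reports holding the VIP, or the node that holds
it is not the one where haproxy is answering, that would at least
narrow down where to look next.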

Does anyone know how to address this issue?

Thanks!!

On Mon, Dec 13, 2010 at 3:04 AM, Juergen Gotteswinter <jg at internetx.de> wrote:
> Not 100% On Topic, but perhaps you should try keepalived for vrrp
> failover on Loadbalancers. Much more reliable, easier to setup and
> faster switch to the standby host
>
> keepalived.org
>
> Am 13.12.10 04:50, schrieb Emmett Culley:
>> On 12/11/2010 07:26 PM, bluethundr wrote:
>>> Sorry I forgot to finish the story!!! :)
>>>
>>> And the interface doesn't appear to be sharing the address:
>>>
>>> [root at VIRTCENT01:~]#ip addr sh eth0
>>> 2: eth0:<BROADCAST,MULTICAST,UP,LOWER_UP>   mtu 1500 qdisc pfifo_fast qlen 1000
>>>       link/ether 00:16:36:22:92:70 brd ff:ff:ff:ff:ff:ff
>>>       inet 192.168.1.23/24 brd 192.168.1.255 scope global eth0
>>>       inet6 fe80::216:36ff:fe22:9270/64 scope link
>>>          valid_lft forever preferred_lft forever
>>>
>>>
>>> And I can't ping the virtual address I had tried to setup using heartbeat:
>>>
>>> [root at VIRTCENT01:~]#ping 192.168.1.200
>>> PING 192.168.1.200 (192.168.1.200) 56(84) bytes of data.
>>> From 192.168.1.23 icmp_seq=1 Destination Host Unreachable
>>> From 192.168.1.23 icmp_seq=2 Destination Host Unreachable
>>> From 192.168.1.23 icmp_seq=3 Destination Host Unreachable
>>>
>>> thanks again!!!
>>>
>>>
>>>
>>> On Sat, Dec 11, 2010 at 10:13 PM, bluethundr<bluethundr at gmail.com>   wrote:
>>>> hello list!
>>>>
>>>>    I am attempting to set up haproxy using a shared IP that I am trying to
>>>> configure using the heartbeat package that I currently have installed:
>>>>
>>>>    [root at VIRTCENT01:~]#rpm -qa | grep heartbeat | grep -v -e stonith -e pils
>>>> heartbeat-2.1.4-11.el5
>>>> heartbeat-2.1.4-11.el5
>>>>
>>>>
>>>> I have /etc/ha.d/authkeys set up this way:
>>>>
>>>> #
>>>> auth 2
>>>> #1 crc
>>>> 2 sha1 {SHA}secret
>>>>
>>>> I have /etc/ha.d/resources setup like this:
>>>>
>>>> VIRTCENT01.summitnjhome.com 192.168.1.23
>>>>
>>>> And I have /etc/ha.d/ha.cf set up like this:
>>>>
>>>>    #       What UDP port to use for udp or ppp-udp communication?
>>>> #
>>>> udpport        694
>>>> bcast  eth0
>>>> mcast eth0 225.0.0.1 694 1 0
>>>> ucast eth0 192.168.1.200
>>>> #       What interfaces to heartbeat over?
>>>> udp     eth0
>>>> #
>>>> #       Facility to use for syslog()/logger (alternative to log/debugfile)
>>>> #
>>>> logfacility     local0
>>>> #
>>>> #       Tell what machines are in the cluster
>>>> #       node    nodename ...    -- must match uname -n
>>>> node    lb1.summitnjhome.com
>>>> node    lb2.summitnjhome.com
>>>>
>>>>
>>>> The service seems to start ok:
>>>>
>>>> [root at VIRTCENT01:~]#service heartbeat restart
>>>> Stopping High-Availability services:
>>>>                                                             [  OK  ]
>>>> Waiting to allow resource takeover to complete:
>>>>                                                             [  OK  ]
>>>> Starting High-Availability services:
>>>> 2010/12/11_22:03:55 INFO:  Resource is stopped
>>>>                                                             [  OK  ]
>>>>
>>>> (though I am unsure what the INFO notice about the resource being stopped means).
>>>>
>>>> And I have verified that it is running with ps:
>>>>
>>>> [root at VIRTCENT01:~]#ps auxwww | grep heartbeat
>>>> root      3646  0.1  4.6  12260 12256 ?        SLs  22:03   0:00 heartbeat: master control process
>>>> nobody    3648  0.0  2.1   5664  5660 ?        SL   22:03   0:00 heartbeat: FIFO reader
>>>> nobody    3649  0.0  2.1   5660  5656 ?        SL   22:03   0:00 heartbeat: write: bcast eth0
>>>> nobody    3650  0.0  2.1   5660  5656 ?        SL   22:03   0:00 heartbeat: read: bcast eth0
>>>> root      3653  0.0  0.2  61180   736 pts/1    S+   22:04   0:00 grep heartbeat
>>>>
>>>>
>>>> And verified that the box is listening on port 694 (the port that I
>>>> have set for heartbeat):
>>>>
>>>>
>>>> [root at VIRTCENT01:~]#netstat -tulpn | grep heartbeat
>>>> udp        0      0 0.0.0.0:694                 0.0.0.0:*                   3649/heartbeat: wri
>>>> udp        0      0 0.0.0.0:50550               0.0.0.0:*                   3649/heartbeat: wri
>>>>
>>>> However, although I have the port enabled in iptables:
>>>>
>>>> -A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 694 -j ACCEPT
>>>> -A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
>>>> COMMIT
>>>>
>>>>
>>>> An nmap scan does not see anything active on 694:
>>>>
>>>> bluethundr at bluethundr-laptop:~$ sudo nmap -sT -A virt1
>>>>
>>>> Starting Nmap 5.00 ( http://nmap.org ) at 2010-12-11 22:07 EST
>>>> Warning: Traceroute does not support idle or connect scan, disabling...
>>>> Interesting ports on 192.168.1.23:
>>>> Not shown: 997 filtered ports
>>>> PORT    STATE  SERVICE VERSION
>>>> 22/tcp  open   ssh     OpenSSH 5.6 (protocol 2.0)
>>>> |  ssh-hostkey: 1024 b0:gu:s (DSA)
>>>> |_ 2048 b0:gu:s (RSA)
>>>> 80/tcp  closed http
>>>> 631/tcp closed ipp
>>>> MAC Address: 00:16:36:22:92:70 (Quanta Computer)
>>>> Device type: general purpose
>>>> Running: Linux 2.6.X
>>>> OS details: Linux 2.6.15 - 2.6.26
>>>> Network Distance: 1 hop
>>>>
>>>> OS and Service detection performed. Please report any incorrect
>>>> results at http://nmap.org/submit/ .
>>>> Nmap done: 1 IP address (1 host up) scanned in 11.27 seconds
>>>>
>>>>
>>>>
>>>> I am enclosing an archive of my /etc/ha.d directory in case this is of
>>>> use to anyone. I would certainly appreciate any help anyone could
>>>> provide!
>>>>
>>>> Thanks!!
>>>>
>>>>
>>>> --
>>>> GPG me!!
>>>>
>>>> gpg --keyserver pgp.mit.edu --recv-keys F186197B
>>>>
>>>
>>>
>>>
>> The message you are seeing when you start heartbeat doesn't make any sense to me either, but it does indicate that it started correctly.
>>
>> The line:
>>
>> VIRTCENT01.summitnjhome.com 192.168.1.23
>>
>> should be:
>>
>> VIRTCENT01.summitnjhome.com 192.168.1.200
>>
>> To cause that IP address to be available upon taking control.
>>
>>
>> _______________________________________________
>> CentOS mailing list
>> CentOS at centos.org
>> http://lists.centos.org/mailman/listinfo/centos
>>



-- 
GPG me!!

gpg --keyserver pgp.mit.edu --recv-keys F186197B