On Thu, Mar 3, 2011 at 6:43 PM, Todd <slackmoehrle.lists at gmail.com> wrote:
> Hi All,
> Can anyone help me hash out how best to load balance a website that is
> getting considerable traffic? In the past I only have experience with BigIP
> where you have a load balancing device that keeps track and sends traffic to
> the best server possible at the time. This was a proprietary system that I
> think was something Dell rebranded.
> Right now, the whole site is 400gb of video, HTML5, Apache, PHP, MySQL,
> runs on a single box with 16gb of RAM and mirrored /var/www/html (2x1tb raid
> level drives). I have a Comcast 50/10 connection, 5 statics and I am seeing
> about 125 unique visitors a day. The site runs fine, but in anticipation of
> more traffic as well as a learning experience I would like to load balance.
> Obviously I need a second server just like the one it is running on now. I
> will probably spec something out that is capable of 32gb of RAM.
> What about a dedicated load balancing device? What specs should this be? How
> much RAM, HD, processor? Is it sufficient to buy something with a GB NIC and
> say 4gb of RAM? Can one go slower but more RAM, small HD? I don't really
> quite know how intensive a task this decision making process is for the load
> balancer..
> Right now, as example, I have an Untangle Firewall and it runs on an old AMD
> with 2gb RAM, GB NIC and it seems to do just fine.
> My local computer store has several P4 2.8ghz with 2GB of RAM for like
> $99....
> Can anyone enlighten me on specs, proper setup, caveats....?
> -Jason

You have a lot of issues here, and some unanswered questions. Is the load
on your site mostly bandwidth use? Do you have users who need to log in to
a system? Is the application designed to run with multiple front-ends?

It's easy to get very basic load balancing, but your app will most likely
require "sticky sessions" to ensure the user goes to the same backend
server every time, and many solutions don't have this feature.

Of the free options already listed, here are the problems with them:

- Round Robin DNS: Provides no additional features other than very poor
  "load spreading" across servers. As soon as you talk about load balancing
  there are usually features you need that this cannot provide, like
  automatic failover, dynamic adding/removing of hosts, etc. Sticky
  sessions are simply not possible. RR DNS should not be used except in
  extremely basic situations.

- Linux LVS: This is a good idea on the face of it, but it can open up some
  tricky issues with routing and IP address handling. Also, sticky sessions
  are based on the subnet of the client IP address, which will not work for
  the many corporations that use proxies. I have seen companies that spread
  their proxy load across multiple /8 networks, so there's no way to make
  them stick.

OK, so what's good? For my requirements, HAProxy is excellent. It handles
sticky sessions well, performs monitoring of each host, and allows dynamic
adding/removing of servers as well as maintenance modes. It's very easy to
install and configure. I'm using it as the backend to Apache, which acts as
the SSL termination point. It's been very high performing for us, and I
know a lot of big sites use it as well. The only question I would have with
it is its handling of video, as we only use it for typical web traffic, not
high-bandwidth stuff like that.

Also, make sure any load balancer you have is redundant and has some kind
of failover, using something like pacemaker, heartbeat, etc.
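
To make the sticky-session point concrete, here is a rough Python sketch of
the three selection strategies discussed above. It is a simplified model
under stated assumptions, not how LVS or HAProxy actually implement them:
the backend names, the /24 prefix, the "SERVERID" cookie name and all
function names are made up for illustration.

```python
import hashlib
import ipaddress
from itertools import cycle

# Hypothetical backend names, for illustration only.
BACKENDS = ["web1", "web2"]


def round_robin(backends):
    """Plain rotation: ignores who the client is, so nothing is sticky."""
    return cycle(backends)


def pick_by_subnet(client_ip, backends, prefix=24):
    """Source-network stickiness: hash the client's network address.

    Every client in the same network lands on the same backend, but a user
    whose requests leave through proxies in different networks may not.
    """
    network = ipaddress.ip_network(f"{client_ip}/{prefix}", strict=False)
    digest = hashlib.sha256(str(network.network_address).encode()).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


def pick_by_cookie(cookies, backends, rr):
    """Cookie stickiness: reuse the backend named in the cookie if valid,
    otherwise assign one by round robin and record it in the cookie."""
    server = cookies.get("SERVERID")  # assumed cookie name
    if server not in backends:
        server = next(rr)
        cookies["SERVERID"] = server
    return server


if __name__ == "__main__":
    rr = round_robin(BACKENDS)
    cookies = {}  # stands in for one browser's cookie jar
    # The same user arriving via two proxies in unrelated networks.
    for ip in ["10.1.2.3", "172.16.9.8"]:
        print(ip, "subnet ->", pick_by_subnet(ip, BACKENDS),
              "| cookie ->", pick_by_cookie(cookies, BACKENDS, rr))
```

The cookie approach keys on something the browser carries with it, so it
does not matter which proxy address the request arrives from. The subnet
approach can only keep a user on one backend while their source network
stays the same.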