[CentOS] clustering and load balancing Apache

Wed Feb 11 15:52:06 UTC 2009
J Potter <jpotter-centos at codepuppy.com>

Look at pound: http://www.apsis.ch/pound/

If you are concerned about traffic volume, you might consider running
squid as a caching reverse proxy (accelerator) in front of pound, i.e.:

request -> squid -> pound -> apache

Here squid returns the response for everything that is marked as
cacheable and still fresh, and pound takes care of load balancing
to apache. (Pound can inspect/insert cookies to send visitors to the
same back-end node on subsequent requests.) On some of our setups,
squid answers 98% of the incoming requests, and it can sustain a very
high request volume. Other list users might be able to provide good
stats on what sort of volume they can support. (I'd be curious to
hear what others have seen...)
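
To make the cookie idea concrete, here is a rough Python sketch of
sticky back-end selection. This is only an illustration of the general
technique, not how pound implements it; the back-end addresses, the
cookie name and the pick_backend() helper are all made up.

```python
import itertools

# Hypothetical back-end pool and cookie name (assumptions for illustration).
BACKENDS = ["10.0.0.11:80", "10.0.0.12:80", "10.0.0.13:80"]
COOKIE = "BACKEND"

# Round-robin iterator used for visitors who are not pinned yet.
_next_index = itertools.cycle(range(len(BACKENDS)))


def pick_backend(cookies):
    """Return (backend, cookie_to_set).

    cookies is a dict of the cookies sent with the request.
    cookie_to_set is None when the visitor is already pinned to a
    known back-end, otherwise a (name, value) pair to send back.
    """
    pinned = cookies.get(COOKIE)
    if pinned in BACKENDS:
        # Returning visitor: keep sending them to the same node.
        return pinned, None
    # New visitor (or stale/unknown cookie): pick the next node in turn.
    backend = BACKENDS[next(_next_index)]
    return backend, (COOKIE, backend)
```

A first request with no cookie gets the next node in rotation plus a
cookie to set; later requests carrying that cookie land on the same
node until it drops out of the pool.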

For HA:
	- 2 instances of squid, active/standby or active/active (i.e. two IP
addresses in DNS for the public hostname, with each squid instance
picking up the other's address on failure).
	- 2 instances of pound, active/standby
	- N instances of apache

Re: replication of content on your apache nodes, another poster
suggested drbd. From my understanding, that won't work in the usual
single-primary setup, since only one node can mount the drbd volume
at a time (sharing it would need dual-primary mode plus a cluster
filesystem such as GFS or OCFS2). If you have shared data that needs
to be seen across apache nodes, either stick it in SQL or mount an
NFS volume across the nodes. (But then you have NFS in the picture,
which adds a single point of failure and a possible bottleneck.)

If your apache code rarely changes, then have a master apache node and
write a shell script that runs rsync to push code changes out to the
other instances.
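
A minimal sketch of such a push script, written in Python rather than
shell and wrapping the rsync command line. The docroot path and node
hostnames are placeholders, and it assumes the master can ssh to each
node without a password prompt.

```python
import subprocess

# Placeholder values (assumptions): adjust to your own layout.
DOCROOT = "/var/www/html/"
NODES = ["web2.example.com", "web3.example.com"]


def rsync_command(node, docroot=DOCROOT):
    """Build the rsync command that mirrors docroot to one node.

    -a keeps permissions and timestamps, -z compresses in transit,
    --delete removes files on the node that are gone on the master.
    The trailing slash on the source copies the directory contents.
    """
    return ["rsync", "-az", "--delete", docroot, f"{node}:{docroot}"]


def push_all(nodes=NODES):
    """Run rsync against every node; return the nodes that failed."""
    failed = []
    for node in nodes:
        result = subprocess.run(rsync_command(node))
        if result.returncode != 0:
            failed.append(node)
    return failed
```

Calling push_all() from cron or after a deploy returns the list of
nodes that did not sync, so you can alert on a non-empty result. Be
careful with --delete: try it with rsync's --dry-run option first.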

It's hard to get very specific about what's best for your setup
without knowing the specifics of things like the data sync needs on
the apache nodes, so take all of this with a grain of salt -- or as a
default starting place.

best,
Jeff