clustering and load balancing Apache

Anto Marky

10 Feb 2009

7:57 a.m.

Hi, I am new to clustering and load balancing in Apache. What is the best way of doing it? How do I do the clustering, and what tools do I need? I use CentOS; does it ship with any such tools by default? And how do I do Apache load balancing? Should I rely on the Apache forums or mailing list, or is there a tool I can use in CentOS? Can anyone give me a rough idea of how to do it, so that I can start reading the documentation before I try it?

Thanks and regards Marky

Fajar Priyanto

10 Feb

8:05 a.m.

On Tue, Feb 10, 2009 at 2:57 PM, Anto Marky markycentos@gmail.com wrote:

...

Hi, I am new to clustering and load balancing in Apache. What is the best way of doing it? How do I do the clustering, and what tools do I need? I use CentOS; does it ship with any such tools by default? And how do I do Apache load balancing? Should I rely on the Apache forums or mailing list, or is there a tool I can use in CentOS? Can anyone give me a rough idea of how to do it, so that I can start reading the documentation before I try it?

This is a good start to give you some overview: http://www.ibm.com/developerworks/linux/library/l-linux-ha/index.html

David Hrbáč

8:20 a.m.

Fajar Priyanto wrote:

...

This is a good start to give you some overview: http://www.ibm.com/developerworks/linux/library/l-linux-ha/index.html

Then, you can go here: http://code.google.com/p/ath/ David Hrbáč

Anto Marky

8:33 a.m.

Hi,

Thanks for the link.

On Tue, Feb 10, 2009 at 12:50 PM, David Hrbáč hrbac.conf@seznam.cz wrote:

...

Fajar Priyanto wrote:

...
This is a good start to give you some overview: http://www.ibm.com/developerworks/linux/library/l-linux-ha/index.html

Then, you can go here: http://code.google.com/p/ath/ David Hrbáč _______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos

Anto Marky

8:21 a.m.

Hi,

Thanks for the link.

On Tue, Feb 10, 2009 at 12:35 PM, Fajar Priyanto fajarpri@arinet.org wrote:

...

On Tue, Feb 10, 2009 at 2:57 PM, Anto Marky markycentos@gmail.com wrote:

...
Hi, I am new to clustering and load balancing in Apache. What is the best way of doing it? How do I do the clustering, and what tools do I need? I use CentOS; does it ship with any such tools by default? And how do I do Apache load balancing? Should I rely on the Apache forums or mailing list, or is there a tool I can use in CentOS? Can anyone give me a rough idea of how to do it, so that I can start reading the documentation before I try it?

This is a good start to give you some overview: http://www.ibm.com/developerworks/linux/library/l-linux-ha/index.html

Victor Padro

8:33 a.m.

On Tue, Feb 10, 2009 at 1:21 AM, Anto Marky markycentos@gmail.com wrote:

...

Hi,

Thanks for the link.

On Tue, Feb 10, 2009 at 12:35 PM, Fajar Priyanto fajarpri@arinet.org wrote:

...
On Tue, Feb 10, 2009 at 2:57 PM, Anto Marky markycentos@gmail.com wrote:

...
Hi, I am new to clustering and load balancing in Apache. What is the best way of doing it? How do I do the clustering, and what tools do I need? I use CentOS; does it ship with any such tools by default? And how do I do Apache load balancing? Should I rely on the Apache forums or mailing list, or is there a tool I can use in CentOS? Can anyone give me a rough idea of how to do it, so that I can start reading the documentation before I try it?

This is a good start to give you some overview: http://www.ibm.com/developerworks/linux/library/l-linux-ha/index.html

You should find good introductory material here: http://howtoforge.com/howtos/high-availability

Cheers,

-- "It is human nature to think wisely and act in an absurd fashion." "All the disorder in the world comes from professions badly or mediocrely practiced"

Sergej Kandyla

11:08 a.m.

Anto Marky wrote:

...

Hi, I am new to clustering and load balancing in Apache. What is the best way of doing it? How do I do the clustering, and what tools do I need? I use CentOS; does it ship with any such tools by default? And how do I do Apache load balancing? Should I rely on the Apache forums or mailing list, or is there a tool I can use in CentOS? Can anyone give me a rough idea of how to do it, so that I can start reading the documentation before I try it?

Hi, Apache is good as a backend server for dynamic applications. You could use something like nginx or haproxy as a frontend to balance across multiple backend servers. I'm using nginx. This lightweight web server can serve many thousands of concurrent connections. It works great!

look at http://wiki.codemongers.com/NginxLoadBalanceExample http://blog.kovyrin.net/2006/08/25/haproxy-load-balancer/lang/en/ http://blog.kovyrin.net/2006/05/18/nginx-as-reverse-proxy/lang/en/ and http://highscalability.com/

Another issue is keeping content synchronized between the Apache servers. There are several solutions: NAS/SAN, or software-based replication such as DRBD: http://en.wikipedia.org/wiki/DRBD

Rainer Duffner

11:31 a.m.

Sergej Kandyla wrote:

...

Hi, Apache is good as a backend server for dynamic applications. You could use something like nginx or haproxy as a frontend to balance across multiple backend servers. I'm using nginx. This lightweight web server can serve many thousands of concurrent connections. It works great!

look at http://wiki.codemongers.com/NginxLoadBalanceExample http://blog.kovyrin.net/2006/08/25/haproxy-load-balancer/lang/en/ http://blog.kovyrin.net/2006/05/18/nginx-as-reverse-proxy/lang/en/ and http://highscalability.com/

Yup. NGINX is probably the fastest way to serve content nowadays. But content has to be static and be available as a file (AFAIK) directly to NGINX.

There's also "varnish", if you can't meet the above provision easily.

...

Another issue is keeping content synchronized between the Apache servers. There are several solutions: NAS/SAN, or software-based replication such as DRBD: http://en.wikipedia.org/wiki/DRBD

Or GFS, if one is into this sort of stuff... But a NAS is much less complex to debug ;-)

Rainer

Sergej Kandyla

11 Feb

11:49 a.m.

New subject: clustering and load balancing Apache, using nginx

Rainer Duffner wrote:

...

Sergej Kandyla wrote:

...
Hi, Apache is good as a backend server for dynamic applications. You could use something like nginx or haproxy as a frontend to balance across multiple backend servers. I'm using nginx. This lightweight web server can serve many thousands of concurrent connections. It works great!

look at http://wiki.codemongers.com/NginxLoadBalanceExample http://blog.kovyrin.net/2006/08/25/haproxy-load-balancer/lang/en/ http://blog.kovyrin.net/2006/05/18/nginx-as-reverse-proxy/lang/en/ and http://highscalability.com/

Yup. NGINX is probably the fastest way to serve content nowadays. But content has to be static and be available as a file (AFAIK) directly to NGINX.

No, nginx can serve any kind of content via the ngx_http_proxy_module: http://wiki.codemongers.com/NginxHttpProxyModule For example, I'm using nginx as a reverse proxy for Tomcat servers and applications. I've also written an article about using nginx in shared hosting; see http://directadmin.com/forum/showthread.php?t=27344

When the content is located on the same server (or is reachable via NAS/SAN), nginx can serve it directly using efficient mechanisms such as sendfile: http://wiki.codemongers.com/NginxHttpCoreModule#sendfile

For serving static content, nginx is even several times more efficient than FTP! On some servers with low-power hardware (Celeron/Sempron processors and 512 MB of RAM) I get an upload rate of nearly 100 Mbit/s. That is not nginx's limit; it is the limit of the SATA disks and of the network link to those servers :)

As for load-balancing: http://wiki.codemongers.com/NginxHttpUpstreamModule http://barry.wordpress.com/2008/04/28/load-balancer-update/

...

There's also "varnish", if you can't meet the above provision easily.

Les Mikesell

3:09 p.m.

Sergej Kandyla wrote:

...

No, nginx can serve any kind of content via the ngx_http_proxy_module: http://wiki.codemongers.com/NginxHttpProxyModule For example, I'm using nginx as a reverse proxy for Tomcat servers and applications.

Is there some advantage to this over apache with mod_jk?

-- Les Mikesell lesmikesell@gmail.com

Sergej Kandyla

3:51 p.m.

Les Mikesell wrote:

...

Sergej Kandyla wrote:

...
No, nginx can serve any kind of content via the ngx_http_proxy_module: http://wiki.codemongers.com/NginxHttpProxyModule For example, I'm using nginx as a reverse proxy for Tomcat servers and applications.

Is there some advantage to this over apache with mod_jk?

AFAIK mod_jk is only packaged for RHEL 4/CentOS 4, i.e. Apache 2.0 (of course you could compile it manually for the Apache 2.2 that comes with CentOS 5). So the recommended way on CentOS 5 (Apache 2.2) is to use mod_proxy (mod_proxy_ajp).

The nginx http_proxy module is a universal, comprehensive solution. Also, Apache generally runs in prefork mode; I don't know whether mod_jk or mod_proxy_ajp works with the worker MPM...

In prefork mode Apache creates a child for each incoming request, so it is too expensive in resource usage. Also, Apache spends about 15-30 KB of memory serving each TCP connection, whereas nginx needs only 1-1.5 KB. If you have, for example, about 100 concurrent connections from different IPs, there are nearly 100 Apache forks... it's too expensive.
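To put the per-connection figures above in perspective, the arithmetic can be worked through in a short sketch. The 15-30 KB and 1-1.5 KB values are the poster's estimates rather than measurements, and the helper name `total_kb` is made up for this illustration.

```python
# Per-connection memory estimates quoted in the post (KB); these are the
# poster's figures, not measurements.
APACHE_KB_PER_CONN = (15, 30)
NGINX_KB_PER_CONN = (1, 1.5)


def total_kb(per_conn_range, connections):
    """Return (low, high) total memory in KB for a number of connections."""
    low, high = per_conn_range
    return low * connections, high * connections


if __name__ == "__main__":
    for conns in (100, 1000):
        print(conns, "connections:",
              "apache", total_kb(APACHE_KB_PER_CONN, conns), "KB,",
              "nginx", total_kb(NGINX_KB_PER_CONN, conns), "KB")
```

On these estimates, 100 concurrent connections would cost roughly 1.5-3 MB under Apache against 100-150 KB under nginx, so the gap only starts to matter at much higher connection counts, or if each Apache child holds far more private memory than the estimate assumes.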

If you don't need the full power and flexibility of Apache as a server for dynamic applications, why use it for a simple job such as proxying? So I think nginx is great as a light frontend server.

Example config for proxying to a Tomcat backend:

    location / {
        rewrite ^/$ /tomcatapp/ redirect;
    }

    location /tomcatapp {
        proxy_pass http://localhost:8080/tomcatapp;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        proxy_connect_timeout 120;
        proxy_send_timeout 120;
        proxy_read_timeout 180;
    }

Les Mikesell

6:44 p.m.

Sergej Kandyla wrote:

...

The nginx http_proxy module is a universal, comprehensive solution. Also, Apache generally runs in prefork mode; I don't know whether mod_jk or mod_proxy_ajp works with the worker MPM...

In prefork mode Apache creates a child for each incoming request, so it is too expensive in resource usage.

Have you actually measured this? Preforking apache doesn't fork per request, it forks enough instances to accept the concurrent connection count plus a few spares. Each child would typically handle thousands of requests before exiting and requiring a new fork - the number is configurable.
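The distinction drawn here, that a prefork pool is sized by peak concurrency rather than by the number of requests, can be illustrated with a small sketch. It is not Apache code: the function name `peak_concurrency` and the sample request timings are assumptions invented for the example.

```python
def peak_concurrency(intervals):
    """Return the maximum number of connections open at the same time.

    `intervals` is a list of (start, end) times, one pair per request.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # connection opens
        events.append((end, -1))    # connection closes
    # Sort by time; at equal times process closes (-1) before opens (+1),
    # so a worker freed at time t can take a request arriving at time t.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


if __name__ == "__main__":
    # 1000 hypothetical requests, one arriving every 100 ms, each lasting 250 ms.
    requests = [(i * 100, i * 100 + 250) for i in range(1000)]
    print(len(requests), "requests, peak concurrency:", peak_concurrency(requests))
```

With these made-up timings, 1000 requests never overlap more than three at a time, so three busy children (plus spares) would be enough to serve all of them.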

...

Also, Apache spends about 15-30 KB of memory serving each TCP connection, whereas nginx needs only 1-1.5 KB. If you have, for example, about 100 concurrent connections from different IPs, there are nearly 100 Apache forks... it's too expensive.

A freshly forked child should have nearly 100% memory shared with its parent and other child instances. As things change, this will decrease, but you are going to have to store the unique socket/buffer info somewhere whether it is a copy-on-write fork or allocated in an event-loop program. If you run something like mod_perl, the shared memory effect degrades pretty quickly because of the way perl stores reference counts along with its variables, but I'd expect the base apache and most module code to be pretty good about retaining their inherited shared memory.

...

If you don't need the full power and flexibility of Apache as a server for dynamic applications, why use it for a simple job such as proxying? So I think nginx is great as a light frontend server.

It may be, but I'd like to see some real-world measurements. Most of the discussions about more efficient approaches seem to use straw-man arguments that aren't realistic about the way apache works or timings of a few static pages under ideal conditions that don't match an internet web server.

-- Les Mikesell lesmikesell@gmail.com

nate

7:24 p.m.

Les Mikesell wrote:

...

It may be, but I'd like to see some real-world measurements. Most of the discussions about more efficient approaches seem to use straw-man arguments that aren't realistic about the way apache works or timings of a few static pages under ideal conditions that don't match an internet web server.

In my experience apache has not been any kind of noticeable bottleneck. At my last company we deployed a pair of apache reverse proxy nodes that did:

- reverse proxy (188 rewrite rules)
- HTTP compression (compression level set to 9)
- mod_expires for some static content that we hosted on the front end proxy nodes
- SSL termination for the portion of the sites that needed SSL
- Header manipulation (had to remove some headers to work around IE browser issues with SSL)
- Serve up a "maintenance" page when we took the site down for software updates (this was on another dedicated apache instance)

traffic flow was:

internet->BigIP->proxy->BigIP->front end web servers->BigIP->back end apps (utilizing BigIP's ability to transparently/effortlessly NAT traffic internal to the network, and using HTTP headers to communicate the originating IP addresses from the outside world).

Each proxy node had 8 copies of apache going, 4 for HTTP and 4 for HTTPS. At the moment they seem to average about 125 workers per proxy node, with an average of 80 idle workers per node. CPU averages 3%, memory averages about 650MB (boxes have 3GB). When I first started at the company they were trying to do this via a low-end F5 BigIP load balancer, but it was not able to provide the same level of service at low latency (and that was when we had a dozen proxy rules). I love BigIPs, but for proxies I prefer apache. It wasn't until recently that F5 made their code pseudo-multithreaded; until then, even if you had a 4-CPU load balancer, the proxy stuff could only use one of those CPUs. Because of this limitation, F5 told me that one large local customer had to implement 5 layers of load balancers, because their app design depended on the full proxy support in the BigIPs to route traffic.

Systems were dual proc, single core, hyperthreaded. They proxied requests for four dual proc quad core systems, which seem to average around 25-35% CPU usage and about 5GB of memory usage (8GB total) apiece.

At the company before that we had our stuff split out per customer, and had 3 proxy nodes in front and about 100 web servers and application servers behind them for the biggest customers, having 3 was just for N+1 redundancy, 1 was able to handle the job. And those proxies were single processor.

At my current job 99% of the load is served directly by tomcat, the application on the front end at least is simple by comparison so there's no need for rewrite-type rules. Load balancing is handled by F5 BigIPs, as is SSL termination. We don't do any HTTP compression as far as I know.

I personally would not want to load balance using apache, I load balance with BigIPs, and I do layer 7 proxying(URL inspection) with apache. If I need to do deeper layer 7 inspection then I may resort to F5 iRules, but the number of times I've had to do that over the past several years I think is maybe two. And even today with the latest version of code, our dual processor BigIPs cannot run in multithreaded mode, it's not supported on the platform, only on the latest & greatest(ours is one generation back from the latest).

I use apache because I've been using it for so long and know it so well, it's rock solid stable at least for me, and the fewer different platforms I can use reduces complexity and improves manageability for me.

If I was in a situation where apache couldn't scale to meet the needs and something else was there that could handle say 5x the load, then I might take a look. So far haven't come across that yet.

nate

Sergej Kandyla

12 Feb

12:03 p.m.

Les Mikesell wrote:

...

Sergej Kandyla wrote:

...
The nginx http_proxy module is a universal, comprehensive solution. Also, Apache generally runs in prefork mode; I don't know whether mod_jk or mod_proxy_ajp works with the worker MPM...

In prefork mode Apache creates a child for each incoming request, so it is too expensive in resource usage.

Have you actually measured this? Preforking apache doesn't fork per request, it forks enough instances to accept the concurrent connection count plus a few spares. Each child would typically handle thousands of requests before exiting and requiring a new fork - the number is configurable.

Sorry for the bad explanation. I meant that Apache creates a child (above MinSpareServers) to serve each new unique client.

I measured nginx in real life :) On one server (~15k unique hosts per day, ~100k pageviews, and 1-3k concurrent TCP connections in the "established" state) with a frontend (nginx) / backend (Apache + PHP FastCGI) architecture, I turned off the nginx proxying and the server went away for a minute... Apache forked up to MaxClients (500) and took all the memory.

nginx also helped me protect against low-to-medium DDoS attacks. Where Apache forks up to MaxClients, nginx can serve many thousands of concurrent connections. So I wrote shell scripts to parse the nginx logs and put the bots' IPs into a firewall table.
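The log-parsing step mentioned above might look something like the following sketch. The post describes shell scripts that are not shown in the thread; this is a Python stand-in, not the poster's code. It assumes the client address is the first field of each line (as in nginx's default "combined" log format), the function name and threshold are hypothetical, and it only reports candidates instead of touching any firewall.

```python
from collections import Counter


def busiest_clients(log_lines, threshold):
    """Return {ip: hits} for clients with at least `threshold` requests.

    Assumes the client address is the first whitespace-separated field,
    as in nginx's default "combined" access log format.
    """
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:                      # skip blank lines
            hits[fields[0]] += 1
    return {ip: n for ip, n in hits.items() if n >= threshold}


if __name__ == "__main__":
    # Made-up sample lines; real input would come from the access log file.
    sample = [
        '192.0.2.10 - - [11/Feb/2009:12:00:00 +0200] "GET / HTTP/1.1" 200 512 "-" "bot"',
        '192.0.2.10 - - [11/Feb/2009:12:00:00 +0200] "GET / HTTP/1.1" 200 512 "-" "bot"',
        '192.0.2.10 - - [11/Feb/2009:12:00:01 +0200] "GET / HTTP/1.1" 200 512 "-" "bot"',
        '198.51.100.7 - - [11/Feb/2009:12:00:01 +0200] "GET /a HTTP/1.1" 200 900 "-" "Firefox"',
    ]
    for ip, n in busiest_clients(sample, threshold=3).items():
        print(ip, n)   # candidates to review before adding to a firewall table
```

A raw hit count is a crude signal: proxies and NAT gateways can legitimately produce many requests from one address, so the output is better treated as a list to review than as an automatic block list.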

Therefore I find nginx (lighttpd is also a good choice) efficient enough, at least for me. Of course, you should understand what you expect from nginx: what it can do and what it can't.

If you want real-world measurements or examples of nginx on heavily loaded sites, please search Google. You could also ask on the nginx at sysoev.ru mailing list (EN).

...

...
Also, Apache spends about 15-30 KB of memory serving each TCP connection, whereas nginx needs only 1-1.5 KB. If you have, for example, about 100 concurrent connections from different IPs, there are nearly 100 Apache forks... it's too expensive.

A freshly forked child should have nearly 100% memory shared with its parent and other child instances.

Please tell me how many resources you would need to reverse-proxy, for example, roughly 1k-2k unique clients with Apache. What CPU load and memory usage would you have?

I think that Apache is great software. It's very flexible and feature-rich, but it is especially good as a backend for dynamic applications (mod_php, mod_perl, etc.). If you need to serve many thousands of concurrent connections, you should look at nginx, lighttpd, squid, etc., IMHO.

http://www.kegel.com/c10k.html

...

As things change, this will decrease, but you are going to have to store the unique socket/buffer info somewhere whether it is a copy-on-write fork or allocated in an event-loop program. If you run something like mod_perl, the shared memory effect degrades pretty quickly because of the way perl stores reference counts along with its variables, but I'd expect the base apache and most module code to be pretty good about retaining their inherited shared memory.

Les Mikesell

6:04 p.m.

Sergej Kandyla wrote:

...

...
...
In prefork mode Apache creates a child for each incoming request, so it is too expensive in resource usage.

Have you actually measured this? Preforking apache doesn't fork per request, it forks enough instances to accept the concurrent connection count plus a few spares. Each child would typically handle thousands of requests before exiting and requiring a new fork - the number is configurable.

Sorry for the bad explanation. I meant that Apache creates a child (above MinSpareServers) to serve each new unique client.

That's actually for each concurrent connection, not each unique client. Browsers may fire off many simultaneous connections but http connections typically have a very short life, so unless users are downloading big files, streaming data, or have low-bandwidth connections (or your back end service is slow), you shouldn't have that much concurrency.

...

I measured nginx in real life :) On one server (~15k unique hosts per day, ~100k pageviews, and 1-3k concurrent TCP connections in the "established" state) with a frontend (nginx) / backend (Apache + PHP FastCGI) architecture, I turned off the nginx proxying and the server went away for a minute... Apache forked up to MaxClients (500) and took all the memory.

There are many factors that can affect it, but that seems like too many concurrent connections for that amount of traffic. The obvious thing to check is whether you have keepalives on and if so, what timeout you use. On a busy internet site you want it off or very short. Also, I'm not sure the fastcgi interface gives the same buffer/decoupling effect that you get with a proxy. With a proxy, the heavyweight backend is finished and can accept the next request as soon as it has sent its output to the proxy which may take much longer to deliver to slow clients. The fastcgi interface might keep the backend tied up until the output is delivered. If that is the case, you would get much of the same effect with apache as a front end proxy. Running apache as a proxy might work with less memory in threaded mode too.
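The intuition that 1-3k established connections is a lot for about 100k pageviews a day can be checked roughly with Little's law (average open connections = connection arrival rate x average time each connection stays open). In the sketch below only the pageview count comes from the thread; the connections-per-page figure and the connection lifetimes are assumptions picked for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_concurrency(pageviews_per_day, connections_per_page, seconds_open):
    """Little's law: mean open connections = arrival rate * mean time open."""
    connections_per_second = (
        pageviews_per_day * connections_per_page / SECONDS_PER_DAY
    )
    return connections_per_second * seconds_open


if __name__ == "__main__":
    pageviews = 100_000          # from the post
    per_page = 4                 # assumed connections opened per pageview
    # Assumed lifetimes: a short request, then two keepalive-style timeouts.
    for seconds_open in (0.5, 15, 60):
        print(seconds_open, "s open ->",
              round(average_concurrency(pageviews, per_page, seconds_open), 1),
              "connections on average")
```

This is a daily average, and peak-hour traffic would be several times higher, so it is only an order-of-magnitude check. It does suggest why keepalive timeouts are the first thing to look at: the connection lifetime multiplies directly into the concurrency.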

...

nginx also helped me protect against low-to-medium DDoS attacks. Where Apache forks up to MaxClients, nginx can serve many thousands of concurrent connections. So I wrote shell scripts to parse the nginx logs and put the bots' IPs into a firewall table.

Basically if your backend can't deliver the data at the rate the requests come in you are fried anyway.

...

Therefore I find nginx (lighttpd is also a good choice) efficient enough, at least for me. Of course, you should understand what you expect from nginx: what it can do and what it can't.

If you want real-world measurements or examples of nginx on heavily loaded sites, please search Google. You could also ask on the nginx at sysoev.ru mailing list (EN).

Thanks, I hadn't found much about it in english.

...

...
...
Also, Apache spends about 15-30 KB of memory serving each TCP connection, whereas nginx needs only 1-1.5 KB. If you have, for example, about 100 concurrent connections from different IPs, there are nearly 100 Apache forks... it's too expensive.

A freshly forked child should have nearly 100% memory shared with its parent and other child instances.

Please tell me how many resources you would need to reverse-proxy, for example, roughly 1k-2k unique clients with Apache. What CPU load and memory usage would you have?

I'm not sure there are good ways to measure the shared copy-on-write RAM of forked processes. But 15k/connection doesn't sound unreasonable, keeping in mind that you have to buffer all unacknowledged data somewhere.

...

I think that Apache is great software. It's very flexible and feature-rich, but it is especially good as a backend for dynamic applications (mod_php, mod_perl, etc.). If you need to serve many thousands of concurrent connections, you should look at nginx, lighttpd, squid, etc., IMHO.

I've been using F5 load balancers for the hard part of this for a while but I'd still wonder why you have that much concurrency instead of delivering the page and dropping the connection.

-- Les Mikesell lesmikesell@gmail.com

Florin Andrei

10 Feb

8:31 p.m.

Sergej Kandyla wrote:

...

Apache is good as a backend server for dynamic applications. You could use something like nginx or haproxy as a frontend to balance across multiple backend servers. I'm using nginx. This lightweight web server can serve many thousands of concurrent connections. It works great!

In addition to the user-space solutions mentioned above, there are also kernel-level solutions, such as Linux Virtual Server, or LVS:

http://www.linuxvirtualserver.org/

I am under the impression that, speaking in general, user-space balancers provide more features (are smarter), while the kernel-space ones are faster (provide more in terms of raw speed and max load). I could be wrong.

Can anybody provide a performance comparison between, say, nginx and LVS? (max connections, max new connections rate, max bandwidth, max packets per second, etc.)

-- Florin Andrei http://florin.myip.org/

Sergej Kandyla

11 Feb

12:13 p.m.

Florin Andrei wrote:

...

Sergej Kandyla wrote:

...
Apache is good as a backend server for dynamic applications. You could use something like nginx or haproxy as a frontend to balance across multiple backend servers. I'm using nginx. This lightweight web server can serve many thousands of concurrent connections. It works great!

In addition to the user-space solutions mentioned above, there are also kernel-level solutions, such as Linux Virtual Server, or LVS:

http://www.linuxvirtualserver.org/

IMHO it's not right to compare a light web server with virtual servers.

Look at http://www.linuxvirtualserver.org/whatis.html In that scheme you could naturally use nginx as the load balancer on the "Load Balancer Linux Box".

Also "The mission of the project is to build a high-performance and highly available server for Linux using clustering http://en.wikipedia.org/wiki/Computer_cluster technology, which provides good scalability, reliability and serviceability."

If you need high availability you could also use Xen/KVM or OpenVZ. These technologies are under active development... Xen/KVM are supported natively in the RHEL/CentOS kernel. I prefer OpenVZ as lightweight virtualization: http://wiki.openvz.org/HA_cluster_with_DRBD_and_Heartbeat

...

I am under the impression that, speaking in general, user-space balancers provide more features (are smarter), while the kernel-space ones are faster (provide more in terms of raw speed and max load). I could be wrong.

Can anybody provide a performance comparison between, say, nginx and LVS? (max connections, max new connections rate, max bandwidth, max packets per second, etc.)

J Potter

4:52 p.m.

Look at pound: http://www.apsis.ch/pound/

If you are concerned about traffic volume, you might consider running squid as a transparent proxy in front of pound. I.e.:

request -> squid -> pound -> apache

Where squid will return the response for everything marked as cacheable and still fresh, and pound will take care of load balancing to apache. (Pound can inspect/insert cookies to send visitors to the same back-end node on subsequent requests.) On some of our setups, squid responds to 98% of the requests coming in, and is able to respond to an insanely high volume of requests. Other list users might be able to provide good stats as to what sort of volume they can support. (I'd be curious to hear what others have seen...)
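The cookie-based stickiness described in the parenthesis can be sketched as follows. This is not Pound's implementation: the class name, the cookie name `backend`, and the backend addresses are all hypothetical, and a real balancer would also need health checks and a less guessable cookie value.

```python
import itertools


class StickyBalancer:
    """Round-robin for new visitors; honour a backend cookie afterwards."""

    COOKIE = "backend"   # hypothetical cookie name

    def __init__(self, backends):
        self.backends = list(backends)
        self._next = itertools.cycle(self.backends)

    def choose(self, cookies):
        """Return (backend, cookies_to_set) for a request's cookie dict."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self.backends:
            return pinned, {}               # returning visitor: same node
        backend = next(self._next)          # new visitor: next node in turn
        return backend, {self.COOKIE: backend}


if __name__ == "__main__":
    lb = StickyBalancer(["10.0.0.1:80", "10.0.0.2:80"])
    first, set_cookie = lb.choose({})
    again, _ = lb.choose(set_cookie)
    print(first, again)   # the same backend both times
```

A cookie naming a backend that is no longer in the list is ignored, so the visitor is simply assigned the next node in rotation.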

For HA:
- 2 instances of squid, active/standby or active/active (i.e. two IP addresses in DNS for the public hostname, and have each squid instance pick up the other's during failure).
- 2 instances of pound, active/standby
- N instances of apache

Re: replication of content on your apache nodes, another poster suggested drbd. From my understanding, I do not think this is possible, since only one node can mount the drbd volume at a time. If you have shared data that needs to be seen across apache nodes, either stick it in SQL or mount an NFS volume across the nodes. (But then you have NFS in the picture, which might not be so good.)

If your apache code is constant, then have a master apache node and write a shell script that runs rsync to push code changes out to the other instances.
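A push script along these lines might look like the sketch below. The post suggests a shell script; this is a Python equivalent that builds the same kind of rsync command per node. The host names and paths are hypothetical, and `--delete` (which removes files on the target that no longer exist on the master) is an assumption about what "push code changes" should mean.

```python
import subprocess


def rsync_commands(source_dir, hosts, dest_dir):
    """Build one rsync command per target host (nothing is executed here)."""
    src = source_dir.rstrip("/") + "/"      # trailing slash: copy contents
    return [["rsync", "-az", "--delete", src, f"{host}:{dest_dir}"]
            for host in hosts]


def push(source_dir, hosts, dest_dir, dry_run=True):
    """Run the commands; with dry_run=True rsync only reports what it would do."""
    for cmd in rsync_commands(source_dir, hosts, dest_dir):
        if dry_run:
            cmd = cmd[:1] + ["--dry-run"] + cmd[1:]
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical host names and paths.
    for cmd in rsync_commands("/var/www/html", ["web2", "web3"], "/var/www/html/"):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes it easy to inspect what would run before pointing it at live nodes.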

It's hard to get very specific about what's best for your setup without knowing the specifics of things like the data sync needs on the apache nodes, so take all of this with a grain of salt -- or as a default starting place.

best, Jeff

Florin Andrei

8:50 p.m.

J Potter wrote:

...

It's hard to get very specific about what's best for your setup without knowing the specifics of things like the data sync needs on the apache nodes, so take all of this with a grain of salt -- or as a default starting place.

I did not ask anything related to my setup. I already use a couple different load balancing technologies.

I was just curious about performance comparisons between different types of load balancers in general.

-- Florin Andrei http://florin.myip.org/

Jure Pečar

8:55 p.m.

On Wed, 11 Feb 2009 11:50:34 -0800 Florin Andrei florin@andrei.myip.org wrote:

...

I was just curious about performance comparisons between different types of load balancers in general.

It's hard to say ... you usually use load balancers to achieve higher availability, and put as little as possible in the way of traffic when you want performance (save for the most expensive hardware load balancers).

For Apache, I had great success with mod_backhand, available at www.backhand.org iirc. It's one of the smartest balancers, but only available for apache 1.3. I've heard 1.3 is still faster than 2.x in many cases.

But I've been nginx-only for a few years now ;)

-- Jure Pečar http://jure.pecar.org

Anto Marky

13 Feb

6:32 a.m.

Thanks for your reply

On Wed, Feb 11, 2009 at 9:22 PM, J Potter jpotter-centos@codepuppy.com wrote:

...

Look at pound: http://www.apsis.ch/pound/

If you are concerned about traffic volume, you might consider running squid as a transparent proxy in front of pound. I.e.:

request -> squid -> pound -> apache

Where squid will return the response for everything marked as cacheable and still fresh, and pound will take care of load balancing to apache. (Pound can inspect/insert cookies to send visitors to the same back-end node on subsequent requests.) On some of our setups, squid responds to 98% of the requests coming in, and is able to respond to an insanely high volume of requests. Other list users might be able to provide good stats as to what sort of volume they can support. (I'd be curious to hear what others have seen...)

For HA:
- 2 instances of squid, active/standby or active/active (i.e. two IP addresses in DNS for the public hostname, and have each squid instance pick up the other's during failure).
- 2 instances of pound, active/standby
- N instances of apache

Re: replication of content on your apache nodes, another poster suggested drbd. From my understanding, I do not think this is possible, since only one node can mount the drbd volume at a time. If you have shared data that needs to be seen across apache nodes, either stick it in SQL or mount an NFS volume across the nodes. (But then you have NFS in the picture, which might not be so good.)

If your apache code is constant, then have a master apache node and write a shell script that runs rsync to push code changes out to the other instances.

It's hard to get very specific about what's best for your setup without knowing the specifics of things like the data sync needs on the apache nodes, so take all of this with a grain of salt -- or as a default starting place.

best, Jeff

Anto Marky

11 Feb

9:47 a.m.

Hi,

Thanks for your reply,

If I have my content in a centralised system like Amazon S3, will I have problems synchronizing?

Thanks and Regards Marky

On Tue, Feb 10, 2009 at 3:38 PM, Sergej Kandyla sk.paix@gmail.com wrote:

...

Anto Marky wrote:

...
Hi, I am new to clustering and load balancing in Apache. What is the best way of doing it? How do I do the clustering, and what tools do I need? I use CentOS; does it ship with any such tools by default? And how do I do Apache load balancing? Should I rely on the Apache forums or mailing list, or is there a tool I can use in CentOS? Can anyone give me a rough idea of how to do it, so that I can start reading the documentation before I try it?

Hi, Apache is good as a backend server for dynamic applications. You could use something like nginx or haproxy as a frontend to balance across multiple backend servers. I'm using nginx. This lightweight web server can serve many thousands of concurrent connections. It works great!

look at http://wiki.codemongers.com/NginxLoadBalanceExample http://blog.kovyrin.net/2006/08/25/haproxy-load-balancer/lang/en/ http://blog.kovyrin.net/2006/05/18/nginx-as-reverse-proxy/lang/en/ and http://highscalability.com/

Another issue is keeping content synchronized between the Apache servers. There are several solutions: NAS/SAN, or software-based replication such as DRBD: http://en.wikipedia.org/wiki/DRBD

John R Pierce

9:52 a.m.

Anto Marky wrote:

...

If I have my content in a centralised system like Amazon S3, will I have problems synchronizing?

s3 is an example of a DE-centralized distributed cloud system.

by the simple fact that you're asking such a vague and generic question, I'd hazard to guess, yes, you will have problems with synchronization whatever it is you're doing, depending on your expectations and experience, of course.

Rainer Duffner

9:58 a.m.

John R Pierce wrote:

...

Anto Marky wrote:

...
If I have my content in a centralised system like Amazon S3, will I have problems synchronizing?

s3 is an example of a DE-centralized distributed cloud system.

by the simple fact that you're asking such a vague and generic question, I'd hazard to guess, yes, you will have problems with synchronization whatever it is you're doing, depending on your expectations and experience, of course.

Hm. He could try reverse-proxying his content locally ;-)

Rainer

discuss@lists.centos.org

23 comments

12 participants

participants (12)

Anto Marky
David Hrbáč
Fajar Priyanto
Florin Andrei
J Potter
John R Pierce
Jure Pečar
Les Mikesell
nate
Rainer Duffner
Sergej Kandyla
Victor Padro