Linux HA may not be the best choice in your situation. [CentOS]High Availability using 2 sites

Thu Jan 5 21:03:48 UTC 2006
Todd Reed <treed at astate.edu>

Ok...I see some of your points, but if SITE_A talks to SITE_B via the
Internet and they use 1:1 NAT, then the Internet going down at SITE_A
breaks the 1:1 NAT.  Servers being down is different from the Internet
connection being down.

I never said RRDNS should be used for failover.  I said it could be used
in combination with my idea.

I guess it would help to know whether the web services serve only the
company or the public Internet.

--Todd

-----Original Message-----
From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
Behalf Of Bryan J. Smith
Sent: Thursday, January 05, 2006 2:55 PM
To: CentOS mailing list
Subject: RE: Linux HA may not be the best choice in your situation.
[CentOS]High Availability using 2 sites

Todd Reed <treed at astate.edu> wrote:
> As for Bryan's Message:
> 1)  You are right, to a point. 

To what point?  Round-robin DNS is not, and never will be,
failover.  And even some logic we've discussed here is rather
subjective and arbitrary, even for one, specific app --
working on a corporate network (before considering across the
Internet).
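
To illustrate the point, here is a minimal sketch of a round-robin
rotation over two A records, one of which points at a site that is
down.  The addresses (taken from the documentation ranges) and the
helper names are assumptions made up for this illustration; this does
not model any real DNS server.

```python
from itertools import cycle

# Hypothetical A records for one hostname (documentation address ranges).
SITE_A = "192.0.2.10"
SITE_B = "198.51.100.10"

def resolve_round_robin(records, n_queries):
    """Return the address handed to each of n_queries successive clients."""
    rotation = cycle(records)
    return [next(rotation) for _ in range(n_queries)]

def failed_fraction(answers, down):
    """Fraction of clients that were handed an address in the `down` set."""
    return sum(1 for a in answers if a in down) / len(answers)

# Round-robin keeps rotating through both records even though site A is down.
answers = resolve_round_robin([SITE_A, SITE_B], 10)
print(failed_fraction(answers, down={SITE_A}))
```

The rotation has no idea which site is alive, so in this sketch half
of the clients are still sent to the dead address.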

> I'm not saying that my idea was a replacement for BGP.  I even
> state that by saying "This is no substitution for BGP".

Again, I wish I had _never_ said BGP.  I meant an AS.

> 2)  You are correct in saying that you need a change in how
> the world sees you (whether that be in the application or
> layer-3 routing).  That can be done at the application layer or
> at layer 3.  But if you are a small company that has only 1
> Internet link at your primary site and 1 Internet connection at
> your remote site, and you rely on the Internet for
> communications between the sites, then you technically do
> not qualify for an AS number.  That makes BGP useless.
> That puts your solution somewhere in the upper layers of
> the OSI model.

That's why I gave a _second_ recommendation.

If you can guarantee your borders will _at_least_ be up, even
if the servers behind it are down, you can implement 1-to-1
NAT at the border.  I.e., if site A's servers are down, site
A's border can use 1-to-1 NAT to target site B's servers.
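
As a rough sketch of what that could look like on a Linux border,
assuming iptables is the tool in use (nothing in this thread names
one): the hypothetical helper below only builds the command lines and
executes nothing.  All addresses are made-up documentation addresses,
and the SNAT rule is an assumption added so that site B's replies
return through site A's border.

```python
def failover_nat_rules(public_ip, remote_ip, border_ip, port=80):
    """Build iptables commands that forward one public service to the other site.

    Nothing is executed here; each command is returned as an argument list.
    """
    # Rewrite the destination of traffic aimed at site A's public address.
    dnat = ["iptables", "-t", "nat", "-A", "PREROUTING",
            "-d", public_ip, "-p", "tcp", "--dport", str(port),
            "-j", "DNAT", "--to-destination", f"{remote_ip}:{port}"]
    # Rewrite the source so site B answers back via site A's border.
    snat = ["iptables", "-t", "nat", "-A", "POSTROUTING",
            "-d", remote_ip, "-p", "tcp", "--dport", str(port),
            "-j", "SNAT", "--to-source", border_ip]
    return [dnat, snat]

# Hypothetical addresses: site A service, site B service, site A border.
rules = failover_nat_rules("192.0.2.10", "198.51.100.10", "192.0.2.1")
for rule in rules:
    print(" ".join(rule))
```

Such rules would only be loaded while site A's servers are down, and
they only help as long as site A's border and its Internet link stay
up.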

Please recognize I'm giving not just the "high end" solution,
but I'm also giving a "feasible" solution for SMBs too.  ;->

[ SIDE NOTE:  I was _not_ the person who brought up Google
either.  But when someone did, I (as well as at least 1
other) showed that I wasn't off-the-mark on how Google does
it either.  ;-]

> 3)  My idea was more along the lines of Internet failure at the
> primary site and using RRDNS on top of that.  If the secondary
> DNS server kicked in and pulled its configs from the hidden
> master, then the hidden record wouldn't be configured for RRDNS
> because it is only used when the primary site fails.

The problem is _still_ propagation.

That's why, in the absence of your own AS, you need the
failed site to redirect all traffic under the guise of 1-to-1
NAT to the site that is up.  It's very simple to do, and they
even have affordable devices to do so with Linux+ASICs (i.e.,
faster than a host-based Linux solution).

> 4)  My idea was also an application layer solution.
> Last time I checked, BGP was in Layer 3.

Forget I even mentioned ASNs for a moment.  I _also_
mentioned using 1-to-1 NAT between sites, and it works _well_
too.

It not only _avoids_ the propagation issue, but better yet,
it works _while_ propagation is still occurring.
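
A small simulation may make this concrete.  It is only a sketch under
stated assumptions: one client with a cached record, a one-hour TTL,
site B and site A's border both staying up, and made-up addresses and
function names throughout.

```python
SITE_A = "192.0.2.10"      # hypothetical public address at site A
SITE_B = "198.51.100.10"   # hypothetical public address at site B

def cached_lookup(cache, now, authoritative, ttl):
    """Return the address a client uses at time `now` (in seconds).

    The cached answer is kept until it expires; only then is the
    authoritative record fetched again.
    """
    if now >= cache["expires_at"]:
        cache["address"] = authoritative
        cache["expires_at"] = now + ttl
    return cache["address"]

def served_by(address, site_a_servers_up, nat_redirect):
    """Return which site answers a request sent to `address`, or None."""
    if address == SITE_B:
        return "B"
    if site_a_servers_up:
        return "A"
    # Site A's servers are down: only a border redirect can save the request.
    return "B" if nat_redirect else None

# The client resolved the name at t=0, while site A was still healthy.
cache = {"address": SITE_A, "expires_at": 3600}

# Site A's servers fail and DNS is repointed to site B; ten minutes later:
stale = cached_lookup(cache, now=600, authoritative=SITE_B, ttl=3600)
print(stale, served_by(stale, False, nat_redirect=False))
print(stale, served_by(stale, False, nat_redirect=True))

# Once the TTL expires, the client finally picks up the new record.
fresh = cached_lookup(cache, now=3600, authoritative=SITE_B, ttl=3600)
print(fresh, served_by(fresh, False, nat_redirect=False))
```

Until the cached record expires, the client keeps sending requests to
site A's address, and only the redirect at site A's border gets them
answered.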

> 5)  Yes, there are some delays when using DNS.  Likewise
> there are delays with BGP.  Certainly BGP delays are a lot
> quicker.  No argument there.

Sigh, you're not getting my point at all on layer-3.  You've
totally missed it.  It's not comparable to application-level.
It's absolute.

> But again, my idea is looking at it from an application layer
> POV.

_Eventually_ you'd have to change the application layer as
well, _if_ the site was down for a while.  But in the meantime,
you _need_ to do layer-3 redirection for the immediate
failover.

1-to-1 NAT does this.  It's very simple.  It just works.

-- 
Bryan J. Smith     Professional, Technical Annoyance
b.j.smith at ieee.org      http://thebs413.blogspot.com
----------------------------------------------------
*** Speed doesn't kill, difference in speed does ***