Just to clarify, I'm looking at this from an application-layer point of view. One reason is that Tim said he was looking at LinuxHA..."application level" redundancy that uses IP.
Tim, just to let you know, I don't believe that LinuxHA will work in the way you described, because of the different IP ranges. It looks like LinuxHA wants to take over the main machine's IP. With the machines on 2 separate networks, that wouldn't work. I haven't looked, but you may want to look at some sort of load balancing solution where you can "weight" which server gets the majority of the requests. But this would require clients to hit the load balancer at one of the facilities, which defeats your purpose of redundancy...if they can't hit the load balancer, they can't hit the secondary site!
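For what it's worth, the "weighting" idea could be sketched roughly as below. This is only an illustration in Python, not the behaviour of any particular load balancer. The addresses (documentation-range IPs), the 9:1 weights, and the function names are all made up for the example.

```python
import random
from collections import Counter

# Hypothetical backends: documentation-range addresses, made-up weights.
BACKENDS = {
    "192.0.2.10": 9,     # primary site web server
    "198.51.100.10": 1,  # secondary site web server
}

def pick_backend(rng, backends=BACKENDS):
    """Pick one backend address with probability proportional to its weight."""
    addresses = list(backends)
    weights = [backends[a] for a in addresses]
    return rng.choices(addresses, weights=weights, k=1)[0]

def sample_distribution(n, seed=0):
    """Count how often each backend is picked over n simulated requests."""
    rng = random.Random(seed)
    return Counter(pick_backend(rng) for _ in range(n))
```

With these weights roughly nine requests in ten land on the primary site. Note that the picker itself lives at one facility, which is exactly the single point of failure described above.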
As for Bryan's message:
1) You are right, to a point. I'm not saying that my idea was a replacement for BGP. I even state that by saying "This is no substitution for BGP".
2) You are correct in saying that you need to change how the world sees you (whether that be in the application or in layer-3 routing). That can be done at the application layer or at layer 3. But if you are a small company that has only 1 Internet link at your primary site and 1 Internet connection at your remote site, and you rely on the Internet for communications between the sites, then you technically do not qualify for an AS number. That makes BGP useless, and it puts your solution somewhere in the upper layers of the OSI model.
3) My idea was more along the lines of Internet failure at the primary site, with RRDNS on top of that. If the secondary DNS server kicked in and pulled its configs from the hidden master, then the hidden record wouldn't be configured for RRDNS because it is only used when the primary site fails.
4) My idea was also an application layer solution. Last time I checked, BGP was in Layer 3.
5) Yes, there are some delays when using DNS. Likewise there are delays with BGP. Certainly BGP delays are a lot shorter. No argument there. But again, my idea is looking at it from an application layer POV.
-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Bryan J. Smith
Sent: Thursday, January 05, 2006 11:34 AM
To: CentOS mailing list
Subject: RE: [CentOS] High Availability using 2 sites
Todd Reed treed@astate.edu wrote:
I agree, BGP is important to route the IP's, but I've been exploring this same option with a different thought.
I guess you missed my point. It's _not_ just a matter of using BGP for your dynamic routing. It's a matter of getting an assigned autonomous system number, so the Internet addresses your multiple networks as the same network.
[ There's a lot more to the Internet than just IPs ;-]
That's the proper way to do it.
I'd like to hear what others say about this!
I also made the suggestion to enable 1-to-1 NAT at each facility. Should the servers on one site go down, your 1-to-1 NAT devices would redirect requests to servers at the other site.
That doesn't require an additional, "external" registration/administration. Of course it means packets are now routed to your first site first, then your second site, so if the first site is wiped out (with no equipment), that doesn't help you.
Here is my plan (although not implemented or tested) for Web Services. At our main data center we have the primary DNS server and our primary web server. The remote location houses the secondary DNS server and our secondary web server. Also at that second location is a "hidden" master DNS server. .. cut ... That is the theory in a nutshell. I've read that this is possible, but I haven't had a chance to test it.
The problem with the theory is that names are cached all over the Internet. That's why DNS server/name changes don't do squat when it comes to failover.
Now you could _consider_ setting a very low time-to-live (TTL) on your servers -- like 5 minutes. But that doesn't always work either.
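To make the caching problem concrete, here is a toy model in Python. It is a sketch only: no real DNS is involved, times are plain seconds, the name and addresses are placeholders, and it assumes a resolver that honours the TTL exactly (which, as noted, is not always the case in practice).

```python
class CachingResolver:
    """Toy model of a caching resolver; times are plain seconds, no real DNS."""

    def __init__(self, authoritative):
        self.authoritative = authoritative  # callable: name -> (address, ttl)
        self.cache = {}                     # name -> (address, expires_at)

    def resolve(self, name, now):
        cached = self.cache.get(name)
        if cached and now < cached[1]:
            return cached[0]                # still fresh: authority is not asked
        address, ttl = self.authoritative(name)
        self.cache[name] = (address, now + ttl)
        return address


def demo(ttl=300):
    """Show what one cache hands out before and after a DNS repoint."""
    zone = {"www.example.com": ("192.0.2.10", ttl)}     # primary site
    resolver = CachingResolver(lambda name: zone[name])

    first = resolver.resolve("www.example.com", now=0)
    # The primary site fails right after t=0 and DNS is repointed.
    zone["www.example.com"] = ("198.51.100.10", ttl)    # secondary site
    during = resolver.resolve("www.example.com", now=ttl - 1)
    after = resolver.resolve("www.example.com", now=ttl)
    return first, during, after
```

Here `demo()` returns the primary address for the first two lookups and the secondary address only once the TTL has run out, so a 5-minute TTL still means up to 5 minutes of clients being sent to the dead site, even when every cache behaves.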
What do others think about this? This is no substitution for BGP, but for those that don't run BGP or need to re-route the IP networks, this may work.
Again, it's _more_ than just BGP. ;->
You have to modify how the Internet sees you. Not just what you provide to the Internet. ;->
That's a key distinction that most people don't consider.
Todd Reed treed@astate.edu wrote:
As for Bryan's message:
- You are right, to a point.
To what point? Round-robin DNS is not, and never will be, failover. And even some logic we've discussed here is rather subjective and arbitrary, even for one, specific app -- working on a corporate network (before considering across the Internet).
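To illustrate the round-robin point, a rough Python sketch follows. It models only the rotation of answers, not a real name server, and the addresses and function names are placeholders invented for the example.

```python
from itertools import cycle, islice

def round_robin_answers(addresses, n):
    """Return the first n answers of a plain round-robin rotation."""
    return list(islice(cycle(addresses), n))

def failed_fraction(answers, down):
    """Fraction of answers that point at addresses known to be down."""
    return sum(a in down for a in answers) / len(answers)
```

With two records and one site down, `failed_fraction(round_robin_answers(["192.0.2.10", "198.51.100.10"], 10), {"192.0.2.10"})` gives 0.5: the rotation has no idea which site is alive, so half the answers still point at the dead one.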
I'm not saying that my idea was a replacement for BGP. I even state that by saying "This is no substitution for BGP".
Again, I wish I would have _never_ said BGP. I mean an AS.
- You are correct in saying that you need to change how the world sees you (whether that be in the application or in layer-3 routing). That can be done at the application layer or at layer 3. But if you are a small company that has only 1 Internet link at your primary site and 1 Internet connection at your remote site, and you rely on the Internet for communications between the sites, then you technically do not qualify for an AS number. That makes BGP useless, and it puts your solution somewhere in the upper layers of the OSI model.
That's why I gave a _second_ recommendation.
If you can guarantee your borders will _at_least_ be up, even if the servers behind it are down, you can implement 1-to-1 NAT at the border. I.e., if site A's servers are down, site A's border can use 1-to-1 NAT to target site B's servers.
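The decision the border has to make can be sketched as follows. This is a hedged Python illustration of the logic only: the actual address rewriting would be done by the NAT device itself, and the function names, the TCP health check, and the port are assumptions made for the example.

```python
import socket

def server_is_up(address, port=80, timeout=2.0):
    """Return True if a TCP connection to address:port succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False

def nat_target(local_server, remote_server, is_up=server_is_up):
    """Choose where the border's 1-to-1 NAT should send traffic.

    Prefer the local server; fall back to the other site's server
    only when the local one fails the health check.
    """
    return local_server if is_up(local_server) else remote_server
```

Clients keep using site A's public address the whole time; only the target behind the border changes, which is why no DNS change is needed for the immediate failover.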
Please recognize I'm giving not just the "high end" solution, but I'm also giving a "feasible" solution for SMBs too. ;->
[ SIDE NOTE: I was _not_ the person who brought up Google either. But when someone did, I (as well as at least 1 other) showed that I wasn't off-the-mark on how Google does it either. ;-]
- My idea was more along the lines of Internet failure at the primary site, with RRDNS on top of that. If the secondary DNS server kicked in and pulled its configs from the hidden master, then the hidden record wouldn't be configured for RRDNS because it is only used when the primary site fails.
The problem is _still_ propagation.
That's why, in the absence of your own AS, you need the failed site to redirect all traffic under the guise of 1-to-1 NAT to the site that is up. It's very simple to do, and they even have affordable devices to do so with Linux+ASICs (i.e., faster than a host-based Linux solution).
- My idea was also an application layer solution.
Last time I checked, BGP was in Layer 3.
Forget I even mentioned ASNs for a moment. I _also_ mentioned using 1-to-1 NAT between sites, and it works _well_ too.
It not only _avoids_ the propagation issue, but better yet, it will work _while_ propagation is still occurring.
- Yes, there are some delays when using DNS. Likewise there are delays with BGP. Certainly BGP delays are a lot shorter. No argument there.
Sigh, you're not getting my point at all on layer-3. You've totally missed it. It's not comparable to application-level. It's absolute.
But again, my idea is looking at it from an application layer POV.
_Eventually_ you'd have to change the application layer as well, _if_ the site was down for a while. But in the meantime, you _need_ to do layer-3 redirection for the immediate failover.
1-to-1 NAT does this. It's very simple. It just works.