[CentOS] High Availability using 2 sites

Thu Jan 5 19:59:17 UTC 2006
Les Mikesell <lesmikesell at gmail.com>

On Thu, 2006-01-05 at 13:01, Bryan J. Smith wrote:
> Les Mikesell <lesmikesell at gmail.com> wrote:
> > The 'round-robin' concept just means that the server will
> > rotate the order of the addresses in the answer.  All
> > addresses are still visible to the client and in the
> > caches.
> > Try 'nslookup www.ibm.com'  to see the effect of multiple
> > A records for the same name. 
> 
> Yes, I know how it works.  What I'm saying is that I don't
> think the Windows resolver, before they even get to MS IE,
> works as you believe.  At least not in an Internet
> environment.  The Windows resolver is very, very different
> than most UNIX resolvers, including a "hold down" for not
> just failed resolution, but failed access.

You are missing the point. If you put multiple A records
for the same name in DNS, all clients will see them
all the time whether they work or not and whether anything
caches them or not.
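
To make that concrete, here is a minimal sketch of what a client
sees, assuming Python and its standard socket module. The name
list_addresses and the injectable 'resolve' parameter are made up
for illustration; only socket.getaddrinfo is a real library call.

```python
import socket

def list_addresses(name, resolve=socket.getaddrinfo):
    """Return every distinct IPv4 address the resolver hands back for name.

    'resolve' defaults to the system resolver; it is a parameter only so
    the function can be exercised without network access (an assumption
    of this sketch, not something from the thread).
    """
    seen = []
    for _family, _type, _proto, _canon, sockaddr in resolve(
            name, None, socket.AF_INET, socket.SOCK_STREAM):
        ip = sockaddr[0]
        if ip not in seen:  # keep the order the resolver gave us
            seen.append(ip)
    return seen
```

Calling list_addresses("www.ibm.com") should return the whole list
when the name has several A records, in whatever order the server
or cache happened to rotate it to.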

> > IE will try them all.  Try setting up multiple A records
> > in your DNS with one pointing to a working web server and
> > one not and see if you even notice a difference when
> > connecting to that name.
> 
> Furthermore, I made the additional point that I think you're
> crossing some attributes of DNS with those of ActiveDirectory
> Server (ADS) integrated DNS.

The DNS server is irrelevant here.  Any server should be
able to serve multiple A records for a name, all the time.

> This is the Windows Resolver at work, not so much MS IE,
> although the integration for ADS-integrated DNS and
> ADS-integrated application like MS IE, do some interesting
> things very _differently_ and _separate_ from how the Windows
> resolver works for _Internet_ addresses.  ;->

And the resolver is irrelevant as well.  Any client should
be able to see the list of addresses, all the time.

> > No, I mean multiple A records.
> 
> But on what server?
> 
> A true BIND or similar DNS server or Windows DNS Server?

It doesn't matter.  Assign multiple A records, you get
a list of IP addresses as the answer.  

> > Most apps are dumb and only try the first one in the list
> > returned so the round robin rotation on the server side
> > gives statistical load balancing but apps other than web
> > browsers tend to fail if the first address doesn't respond.
> 
> I think you're crossing some concepts that MS IE doesn't
> handle, but the Windows resolver does.  And then there are
> ADS considerations as well.

No, the issue is with the stock connect() routine: it tries
one address and gives up.

> > F5 uses a 30 second TTL by default on responses that can
> > change dynamically.  It works well enough through normal
> > caches but apps normally keep their first answer until
> > you restart them.
> 
> But there is a lot of arbitrary cache/resolution between
> their authority and your end-usage.  That's always going to
> be an issue.

In the dynamic scenario, you have a possible problem of
cache admins configuring to use a minimum time of their
own choice rather than following the spec, but that is
rare.  And it doesn't affect an unchanging list.

> > On the contrary, the app is the best place to deal with it
> > if you can.  That is, always return all possible IP
> > addresses in the DNS query (or at least all working sites)
> > and let the app walk through the list until it gets a
> > connection that works.
> 
> Again, arbitrary and you can not only _not_ trust the apps to
> work that way, but worse yet, there's a lot of
> cache/resolution between you, the authority, and the end
> system.

If you write the app you can trust it to work the way you
wrote it and you don't have to worry about anyone's cache.
That's why I suggest doing it that way.  Always give out multiple
IP addresses and don't change DNS.  Write the app to walk
the list of returned addresses itself if the first one it
tries doesn't respond.  This seems to already be done in
the common web browsers.
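
A rough sketch of that approach, assuming a Python client. The
function name connect_any, the timeout value and the injectable
'resolve' parameter are all hypothetical; the point is only the
loop over every returned address.

```python
import socket

def connect_any(host, port, timeout=3.0, resolve=socket.getaddrinfo):
    """Try each address returned for host until one accepts a TCP connection.

    Returns a connected socket, or raises the last connection error if
    none of the addresses respond. 'resolve' is injectable only so the
    sketch can be tested offline.
    """
    last_error = None
    for family, socktype, proto, _canon, sockaddr in resolve(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)   # don't hang forever on a dead site
            sock.connect(sockaddr)
            return sock                # first address that works wins
        except OSError as err:
            last_error = err           # remember why, then try the next one
            if sock is not None:
                sock.close()
    raise last_error or OSError("no addresses returned for %s" % host)
```

In Python the standard socket.create_connection() already walks
the address list in much the same way, so the explicit loop is
mainly useful where you want your own ordering or timeouts.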

> IP address is the only guarantee.  That's why people get AS
> numbers.  You have to appear to be a single point from the
> standpoint of the Internet, even if you're getting your
> connections from 2-3 different providers.

Not really.  If you can't control the app you might have
to live with this.  Otherwise you can give out several
IP addresses for a name and let the app decide which one
is reachable from its location.

> > I have quite a bit of experience with this and that approach
> > is even better than trying to juggle DNS dynamically except
> > for the case where you want to force clients to one location
> > or the other.  For example, you might temporarily have local
> > routing problems at some location that make it impossible to
> > connect to one site or the other that no other test could
> > detect, and if the app has both IP addresses it can still get
> > to the one that works.
> 
> Yes, that works when _you_ can _guarantee_ that all clients
> will talk _directly_ to the authority, or control intermediate
> cache/non-authorities that guarantee adherence to the TTL. 
> That's why it works for intranets as well as Internet
> networks _you_ control.

Not true for the case of supplying multiple A records that
don't change.  The DNS servers/resolvers may change the
order of the list but nothing else.
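
As a toy model of that rotation (not how any particular server
implements it), the sketch below rotates a fixed list by one
position per query. The function name and the 192.0.2.x addresses
are placeholders.

```python
from collections import deque

def round_robin_answers(addresses, queries):
    """Yield the answer list for each query, rotated one step each time.

    This models simple round-robin ordering only; real servers and
    caches may shuffle or rotate differently.
    """
    ring = deque(addresses)
    for _ in range(queries):
        yield list(ring)
        ring.rotate(-1)  # the next query starts with the following address

records = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
answers = list(round_robin_answers(records, 3))
# Every answer holds the same set of addresses; only the order differs.
same_set = all(set(a) == set(records) for a in answers)
```

An app that only tries the first entry sees a different server
on each query, while an app that walks the list always has every
address available to it.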

> But everything changes when you have people who don't access
> the authority of the domain.  And to rely on an application
> is rather arbitrary, especially how I've seen both the
> Windows resolver and MS IE act.

If you can find a repeatable case where IE does the wrong
thing with multiple A records where some work and some don't
please let me know.  I don't claim to understand how it works
but it seems very robust in those circumstances.

> > However, it only works for web apps and ones where
> > you write the client yourself. The standard library
> > 'connect' routines will try one address and give up.
> 
> Yes, which is why you can't trust it.
> Even if you do write it, you're making the assumptions.
> What if the service is not acting like you assume?

How can DNS not work according to the specifications at
least at the 'A' record level?

> DNS does not provide what it seems from the standpoint of
> different utilities (let alone versions), and Microsoft's
> ADS-integrated works very, very differently to make matters
> worse.

It doesn't matter.  Any DNS server should be able to take
multiple A records for one name and any DNS client should
get a list of addresses as the response.  The client app just
needs to know enough to try more than the first one in
the list.  Actually I think some versions of windows will
try to figure out which to try from their route table but
that doesn't seem very predictable.

-- 
  Les Mikesell
    lesmikesell at gmail.com