[CentOS] High Availability using 2 sites

Thu Jan 5 20:21:40 UTC 2006
Bryan J. Smith <thebs413 at earthlink.net>

Les Mikesell <lesmikesell at gmail.com> wrote:
> You are missing the point.

It's very clear that you and I are talking about 2 entirely
different things.  I don't disagree with many of the concepts
you are covering, and I know how round robin DNS works.  But
how these concepts work with respect to high availability is
what I'm taking major issue with.

> The DNS server is irrelevant here.

It's _very_relevant_ if MS-RPC calls are being used and
resolution changes from standard DNS at the _client_!  That
was my point!

> In the dynamic scenario, you have a possible problem of
> cache admins configuring to use a minimum time of their
> own choice rather than following the spec, but that is
> rare.  And it doesn't affect an unchanging list.

Sigh, you're picking and choosing the context you wish to
discuss.  When you're providing server failover, you can't
rely on applications or DNS, but you must make the IP appear
as the same.
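
To make the caching concern quoted above concrete, here is a toy
sketch of a resolver cache that enforces its own minimum TTL.  The
class name, the `min_ttl` knob and the example addresses are
assumptions for illustration only; this does not model any
particular DNS server.

```python
class MinTtlCache:
    """Toy resolver cache that never expires an entry sooner than min_ttl seconds."""

    def __init__(self, min_ttl):
        self.min_ttl = min_ttl
        self.entries = {}  # name -> (addresses, expires_at)

    def store(self, name, addresses, record_ttl, now):
        # The cache admin's floor wins over the TTL published by the zone.
        effective_ttl = max(record_ttl, self.min_ttl)
        self.entries[name] = (list(addresses), now + effective_ttl)

    def lookup(self, name, now):
        entry = self.entries.get(name)
        if entry is None or now >= entry[1]:
            return None  # expired or never cached: must ask upstream again
        return entry[0]


# The zone publishes a 60 second TTL, but the cache holds it for an hour.
cache = MinTtlCache(min_ttl=3600)
cache.store("www.example.com", ["192.0.2.1"], record_ttl=60, now=0)
stale = cache.lookup("www.example.com", now=61)
```

In this sketch a client behind such a cache keeps receiving the old
address long after the published TTL has passed, so a DNS change made
for failover does not reach it.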

On one site, that is doable with NAT -- be it 1-to-1 or
destination, with additional considerations.  Across sites
you have to get far more involved.  This, of course, assumes
you're using stateless sessions (like HTTP), and it changes
radically (and NAT won't work) if you are using stateful
sessions (like RPC, NFS, etc...).

You are _not_ going to address it with DNS.  It might work
for you if you can guarantee all systems hit the true
authority, like you can on a LAN or corporate intranet.  It
might also work if you're using an extended DNS server that
uses alternative services -- such as how ADS and MS IE
interoperate with each other (yes, even when it "seems"
you're using "standard DNS" you're actually not).

> If you write the app you can trust it to work the way you
> wrote it and you don't have to worry about anyone's cache.
> That's why I suggest doing it that way.  Always give out
> multiple IP addresses and don't change DNS.  Write the app to
> walk the list of returned addresses itself if the first one
> tried doesn't respond.

We're talking about web services spread across 2 sites.
What the heck does this context have to do with it?
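
For reference, the client-side approach quoted above (walk the list
of returned addresses until one answers) can be sketched as follows.
It only applies when you control the client code, which is exactly
the point in dispute.  The function names and the injectable
`connect` parameter are assumptions made for illustration.

```python
import socket


def resolve_all(hostname, port):
    """Return every distinct IPv4 address the resolver hands back, in resolver order."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def connect_first_available(addresses, port, timeout=3.0,
                            connect=socket.create_connection):
    """Try each address in order; return (address, connection) for the first that answers."""
    errors = []
    for addr in addresses:
        try:
            return addr, connect((addr, port), timeout=timeout)
        except OSError as exc:
            # This address is down or unreachable; remember why and move on.
            errors.append((addr, exc))
    raise ConnectionError(f"no address responded: {errors}")
```

A caller would use it as
`connect_first_available(resolve_all(name, 80), 80)`.  Python's own
`socket.create_connection()` already tries each resolved address in
turn when given a hostname; the loop is spelled out here only to show
the logic.  Each dead address still costs one full timeout before the
next is tried.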

> This seems to already be done in the common web browsers.

Not the logic you're presenting, no.  I think you're
mega-oversimplifying things, and have the Windows resolver/MS
IE logic _wrong_ on DNS -- other than the basics of how round
robin works.

> Not really.  If you can't control the app you might have
> to live with this.

Is that _not_ the context of this _entire_ thread?

> Not true for the case of supplying multiple A records that
> don't change.  The DNS servers/resolvers may change the
> order of the list but nothing else.

Again, you're continuing to make assumptions about the
applications used, and that they magically handle this logic
the way you want them to.

> If you can find a repeatable case where IE does the wrong
> thing with multiple A records where some work and some
> don't please let me know.  I don't claim to understand how
> it works but it seems very robust in those circumstances.

And I would differ on that assessment, very much so.

I often have to hack the Windows registry just to get MS IE
to work correctly for corporate intranets, much less the
Internet (with far more variables).

> How can DNS not work according to the specifications at
> least at the 'A' record level?

Sigh, I'm not opening up that can of worms (don't get me
started ;-).

I also think you're referring to extended operations of ADS, and
not DNS, with MS IE.  When you think you're just doing simple
DNS resolution, there are MS-RPC calls being made if you have
ADS for your DNS and MS IE for your client.

> Actually I think some versions of windows will
> try to figure out which to try from their route table but
> that doesn't seem very predictable.

Just about everything you have discussed here has been rather
"arbitrary" and not very well understood.

As I mentioned before, I purposely have to hack the Windows
registry (typically pushed via GPOs) just to get MS IE to
stop doing some really stupid things on an intranet.  I
seriously doubt it works as perfectly as you describe over
the Internet with its resolution -- quite the opposite.

The "hold downs" on various things are my biggest issue,
especially when it comes to non-availability.

Bryan J. Smith     Professional, Technical Annoyance
b.j.smith at ieee.org      http://thebs413.blogspot.com
*** Speed doesn't kill, difference in speed does ***