[CentOS] how to increase DNS reliability?

Thu Jul 25 17:11:43 UTC 2019
hw <hw at gc-24.de>

On 7/25/19 3:28 PM, Leroy Tennison wrote:
> If you don't want multiple DNS server entries on the client

I'm OK with them; the only problem is that the clients wait out their timeouts
when a server is unreachable, and users panic.

> then a master and (possibly multiple) slave server configuration can be set up (I'm assuming ISC DNS - their solution to redundancy/failover is master and slave servers, this may be the way it is with all DNS).

Yes, bind9, and I've set up a master and a slave.  The router forwards requests
to them on behalf of those clients that use the router as a name server, while
other clients know the master and slave, but not the router, as name servers.

There was a failure a while ago (IIRC because of a UPS causing a server to
shut down when the battery failed the self test), and things didn't quite work
anymore with the master server being unreachable.

This is why I have a problem with the clients knowing multiple servers: the
very setup is intended to keep things working during an outage, and yet it
doesn't help.
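
One way to see what a client would run into during such an outage is to probe
each name server in turn with a short timeout.  The following is only a rough
Python sketch under my own assumptions: the function names are made up for
illustration, the query is a bare-bones A-record lookup over UDP, and it does
not reproduce the retry logic of the system resolver.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def first_responding_server(name, servers, timeout=1.0):
    """Return the first server that answers within `timeout` seconds, else None."""
    query = build_query(name)
    for server in servers:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendto(query, (server, 53))
                reply, _ = sock.recvfrom(512)
            except OSError:
                continue  # timed out or unreachable: try the next server
            if reply[:2] == query[:2]:  # reply carries the same query id
                return server
    return None
```

Called as first_responding_server("example.com", ["192.0.2.10", "192.0.2.11"])
(placeholder addresses), it waits at most one timeout per dead server, which
is roughly the delay users notice when the first server in the list is down.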

>  keepalived can be used for fail over and will present a single IP address (which the clients would use) shared among the servers.  haproxy or alternatives might be another fail over option.

Thanks, I'll look into that!  I've been searching for "dns proxy" and no useful
results came up ...

> Each technology has its own learning curve (and doing this will require at least two) and caveats.  In particular systemd doesn't appear to play well with technologies creating IP addresses it doesn't manage.  The version of keepalived we're using also has its own nasty quirk as well where it comes up assuming it is master until discovered otherwise, this is true even if it is configured as backup.  In most cases this is probably either a non-issue (no scripts being used) or a minor annoyance.  But if you're using scripts
> triggered by keepalived which make significant (and possibly conflicting) changes to the environment then you'll need to embed "intelligence" in them to wait until final state is reached or test state before acting or some other option.

I consider myself warned :)
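
For the "wait until final state is reached or test state before acting" part,
something along these lines might do.  It is a minimal Python sketch, not
anything keepalived ships: all function names are hypothetical, and the bind
test assumes net.ipv4.ip_nonlocal_bind is off on the host.

```python
import socket
import time


def address_is_local(ip: str) -> bool:
    """True if a UDP socket can bind to `ip`, i.e. the address is configured here.

    Assumes net.ipv4.ip_nonlocal_bind is off; otherwise bind always succeeds.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((ip, 0))
        except OSError:
            return False
        return True


def wait_until_stable(check, needed=3, interval=1.0, max_polls=30, sleep=time.sleep):
    """Poll check() until it returns the same value `needed` times in a row.

    Returns that value, or None if it never settles within `max_polls` polls.
    """
    last, streak = None, 0
    for _ in range(max_polls):
        value = check()
        if value == last:
            streak += 1
        else:
            last, streak = value, 1
        if streak >= needed:
            return value
        sleep(interval)
    return None


def should_act(reported_state: str, vip_is_local) -> bool:
    """Act only if the reported state agrees with what the host actually shows."""
    if vip_is_local is None:  # state never settled: do nothing
        return False
    return (reported_state == "MASTER") == vip_is_local
```

A notify script could take the state keepalived reports from its arguments
(the exact argument position should be checked against the documentation of
the installed version), call
wait_until_stable(lambda: address_is_local("192.0.2.53")) with the shared
address in place of that placeholder, and only make changes when should_act()
returns True.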