On 05/02/2013 02:02 PM, Les Mikesell wrote:
> On Thu, May 2, 2013 at 12:31 PM, Michael Mol <mikemol at gmail.com> wrote:
>>> with its default gateway pointing toward the ISP handling it. DNS
>>> service is simple enough to have standalone servers for each instance
>>> you need.
>>
>> This would also require either resources or underlying authorizations I
>> don't have.
>
> CentOS VMs are really, really cheap....

That's really, truly, seriously not the issue. I don't know if you saw
where I said I was setting up a private cloud. And, as I said, I can't
discuss the problem without breach of NDA.

>>> Web browsers are actually very good at handling multiple IPs in DNS
>>> responses and doing their own failover if some of the IPs don't
>>> respond.
>>
>> It varies greatly by client software. And given the explosion of
>> unreliable network connections (wifi, mobile), some of that failover
>> logic's margin is already lost in dropped packets between the client and
>> their local network gateway.
>
> Yes, but typically they can deal with receiving multiple IPs from the
> initial DNS lookup even if some are broken better/faster than getting
> one IP which subsequently breaks and then having to do another DNS
> lookup to get a working target. At least the few browsers I tested a
> while back did...

You missed my point: your margin is already eaten into by unreliable
networks.

>>> For other services you might need to actively change DNS to drop IPs
>>> if you know they have become unreachable, though.
>>
>> Yup. That's what I was planning on doing, more or less. Start with
>> ordering IPs by route preference, drop IPs by link state. I just wish I
>> could drive it by snooping OSPF...
>
> I don't think you can count on your ordering reaching the clients or
> meaning anything to them if it does. And some applications won't ever
> do a lookup again.

Yes, intermediate resolvers may reorder responses. That's fine and pretty normal.
If ordering responses doesn't work, I fall back to a stochastic
approach; that's actually rather a "given", since an oversaturated link
qualifies as "down" for the purpose of new connections.

And, yes, there's a lot of client software out there (*especially web
browsers*) which caches responses and disregards TTLs. To those users, I
really can only say "have you tried turning it off and back on again?"

But here we are, arguing about *load balancing*, when the problem I face
is, frankly, one of taking either of a pair of *known-to-work* sequences
of invocations of "ip" commands and getting whatever processes
/etc/sysconfig/network-scripts/{ifcfg-eth*,route-eth*} to maneuver the
kernel into the same resulting state.

Source-based routing frankly isn't that hard! From the perspective of an
edge node (i.e. a server):

# First subnet
ip addr add 10.0.0.2/24 dev eth0 brd 10.0.0.255
ip route add default via 10.0.0.1 dev eth0 src 10.0.0.2

# Second subnet
ip addr add 10.1.0.2/24 dev eth0 brd 10.1.0.255
ip route add default via 10.1.0.1 dev eth0 src 10.1.0.2

and from a router's perspective, it's

# Assuming proxy_arp is set on eth0 and eth1
# Sets up source-specific routing for 10.0.0.0/24
# WAN hangs off eth0. LAN hangs off eth1.
ip addr add 10.0.0.2/24 dev eth1 brd 10.0.0.255 # To LAN
ip addr add 10.0.0.2 dev eth0 # For the benefit of 'src 10.0.0.2' below
ip route add 10.0.0.1 dev eth0 src 10.0.0.2 # For 'via 10.0.0.1' below
ip route add default via 10.0.0.1 dev eth0 src 10.0.0.2 from 10.0.0.0/24

# Assuming proxy_arp is set on eth0 and eth1
# Sets up source-specific routing for 10.1.0.0/24
# WAN hangs off eth0. LAN hangs off eth1.
ip addr add 10.1.0.2/24 dev eth1 brd 10.1.0.255 # To LAN
ip addr add 10.1.0.2 dev eth0 # For the benefit of 'src 10.1.0.2' below
ip route add 10.1.0.1 dev eth0 src 10.1.0.2 # For 'via 10.1.0.1' below
ip route add default via 10.1.0.1 dev eth0 src 10.1.0.2 from 10.1.0.0/24

That's it! (unless I typo'd or thinko'd something coming up with these
examples.)
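For comparison, the same per-source intent is often expressed with policy routing: a default route in a dedicated table, plus an "ip rule" that selects that table by source prefix. The following is only a dry-run sketch that prints the commands instead of running them. The helper name emit_source_route and the table numbers 100 and 101 are hypothetical, and it is not claimed to be equivalent to the sequences above.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) policy-routing commands for one
# source subnet. The table numbers passed in are arbitrary, hypothetical
# choices; any unused routing table number would do.
emit_source_route() {
    subnet=$1   # source prefix, e.g. 10.0.0.0/24
    gateway=$2  # next hop for that prefix, e.g. 10.0.0.1
    addr=$3     # preferred source address, e.g. 10.0.0.2
    dev=$4      # outgoing interface, e.g. eth0
    table=$5    # routing table number, e.g. 100

    # Default route that lives only in the per-source table.
    echo "ip route add default via $gateway dev $dev src $addr table $table"
    # Rule that sends traffic sourced from the subnet to that table.
    echo "ip rule add from $subnet table $table"
}

emit_source_route 10.0.0.0/24 10.0.0.1 10.0.0.2 eth0 100
emit_source_route 10.1.0.0/24 10.1.0.1 10.1.0.2 eth0 101
```

Printing the commands keeps the sketch safe to run without root; the output can be reviewed and then piped to a shell on a test machine.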
It took me all of three or four hours yesterday to learn this much of
it. Then I spent the rest of the day discovering that the stuff I was
putting in route-ethN wasn't being honored.

My problem has been that the "from 10.x.0.0/24" parameter keeps getting
stripped by whatever processes
/etc/sysconfig/network-scripts/route-ethN.
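One possible direction, offered as an assumption rather than a diagnosis: the initscripts ifup-routes helper is understood to pass each line of route-ethN to "ip route add" and each line of a separate rule-ethN file to "ip rule add", so the source selector may belong in rule-ethN. The sketch below writes candidate fragments into a scratch directory for review only. The helper name write_fragments and the table numbers 100 and 101 are hypothetical, and why the "from" clause is dropped from route-ethN has not been confirmed here.

```shell
#!/bin/sh
# Sketch: write route-/rule- fragments into a scratch directory for review.
# Nothing is installed into /etc/sysconfig/network-scripts by this script.
write_fragments() {
    outdir=$1   # directory to write into
    dev=$2      # interface name, e.g. eth0
    subnet=$3   # source prefix, e.g. 10.0.0.0/24
    gateway=$4  # next hop, e.g. 10.0.0.1
    addr=$5     # preferred source address, e.g. 10.0.0.2
    table=$6    # hypothetical routing table number

    # route-<dev>: each line is assumed to be handed to "ip route add".
    echo "default via $gateway dev $dev src $addr table $table" >> "$outdir/route-$dev"
    # rule-<dev>: each line is assumed to be handed to "ip rule add".
    echo "from $subnet table $table" >> "$outdir/rule-$dev"
}

scratch=$(mktemp -d)
write_fragments "$scratch" eth0 10.0.0.0/24 10.0.0.1 10.0.0.2 100
write_fragments "$scratch" eth0 10.1.0.0/24 10.1.0.1 10.1.0.2 101
cat "$scratch/route-eth0" "$scratch/rule-eth0"
```

Whether a given initscripts version reads rule-ethN at all should be checked against the ifup-routes script on the machine in question before relying on this.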