[CentOS] Configuring source-specific routing

Thu May 2 18:48:26 UTC 2013
Michael Mol <mikemol at gmail.com>

On 05/02/2013 02:02 PM, Les Mikesell wrote:
> On Thu, May 2, 2013 at 12:31 PM, Michael Mol <mikemol at gmail.com> wrote:
>>> with its default gateway pointing toward the ISP handling it. DNS
>>> service is simple enough to have standalone servers for each instance
>>> you need.
>>
>> This would also require either resources or underlying authorizations I
>> don't have.
> 
> CentOS VMs are really, really cheap....

That's really, truly, seriously not the issue. I don't know if you saw
where I said I was setting up a private cloud.

And, as I said, I can't discuss the problem without breach of NDA.

> 
>>> Web browsers are actually very good at handling multiple IPs in DNS
>>> responses and doing their own failover if some of the IPs don't
>>> respond.
>>
>> It varies greatly by client software. And given the explosion of
>> unreliable network connections (wifi, mobile), some of that failover
>> logic's margin is already lost in dropped packets between the client and
>> their local network gateway.
> 
> Yes, but typically they can deal with receiving multiple IPs from the
> initial DNS lookup even if some are broken better/faster than getting
> one IP which subsequently breaks and then having to do another DNS
> lookup to get a working target. At least the few browsers I tested a
> while back did...

You missed my point: your margin is already eaten into by unreliable
networks.

> 
>>> For other services you might need to actively change DNS to drop IPs
>>> if you know they have become unreachable, though.
>>
>> Yup. That's what I was planning on doing, more or less. Start with
>> ordering IPs by route preference, drop IPs by link state. I just wish I
>> could drive it by snooping OSPF...
> 
> I don't think you can count on your ordering reaching the clients or
> meaning anything to them if it does.  And some applications won't ever
> do a lookup again.

Yes, intermediate resolvers may reorder responses. That's fine and
pretty normal. If ordering responses doesn't work, I fall back to a
stochastic approach; that's actually rather a "given", since an
oversaturated link qualifies as "down" for the purpose of new connections.

And, yes, there's a lot of client software out there (*especially web
browsers*) which cache responses and disregard TTLs. To those users, I
really can only say "have you tried turning it off and back on again?"

But here we are, arguing about *load balancing*, when the problem I face
is, frankly, one of taking either of a pair of *known-to-work* sequences
of invocations of "ip" commands and getting whatever processes
/etc/sysconfig/network-scripts/{ifcfg-eth*,route-eth*} to maneuver the
kernel into the same resulting state.

Source-based routing frankly isn't that hard! From the perspective of an
edge node (i.e. a server):

# First subnet
ip addr add 10.0.0.2/24 dev eth0 brd 10.0.0.255
ip route add default via 10.0.0.1 dev eth0 src 10.0.0.2

# Second subnet
ip addr add 10.1.0.2/24 dev eth0 brd 10.1.0.255
ip route add default via 10.1.0.1 dev eth0 src 10.1.0.2

and from a router's perspective, it's

# Assuming proxy_arp is set on eth0 and eth1
# Sets up source-specific routing for 10.0.0.0/24
# WAN hangs off eth0. LAN hangs off eth1.
ip addr add 10.0.0.2/24 dev eth1 brd 10.0.0.255 # To LAN
ip addr add 10.0.0.2 dev eth0 # For the benefit of 'src 10.0.0.2' below
ip route add 10.0.0.1 dev eth0 src 10.0.0.2 # For 'via 10.0.0.1' below
ip route add default via 10.0.0.1 dev eth0 src 10.0.0.2 from 10.0.0.0/24

# Assuming proxy_arp is set on eth0 and eth1
# Sets up source-specific routing for 10.1.0.0/24
# WAN hangs off eth0. LAN hangs off eth1.
ip addr add 10.1.0.2/24 dev eth1 brd 10.1.0.255 # To LAN
ip addr add 10.1.0.2 dev eth0 # For the benefit of 'src 10.1.0.2' below
ip route add 10.1.0.1 dev eth0 src 10.1.0.2 # For 'via 10.1.0.1' below
ip route add default via 10.1.0.1 dev eth0 src 10.1.0.2 from 10.1.0.0/24

That's it! (Unless I typo'd or thinko'd something coming up with these
examples.) It took me all of three or four hours yesterday to learn this
much of it. Then I spent the rest of the day discovering that the stuff
I was putting in route-ethN wasn't being honored.
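
A minimal dry-run sketch of the edge-node commands above, for checking
them per subnet before running anything as root. The helper name
edge_cmds is hypothetical, and the sketch assumes an iproute2 that
accepts "brd +" (derive the broadcast address from the prefix) in place
of a hand-typed broadcast address. It only prints the commands; it does
not check whether the kernel accepts both default routes.

```shell
#!/bin/sh
# Dry run: print the edge-node commands for one subnet instead of running them.
# edge_cmds is a hypothetical helper, not part of iproute2.
# Usage: edge_cmds ADDR/PREFIX GATEWAY DEVICE
edge_cmds() {
    cidr=$1 gw=$2 dev=$3
    addr=${cidr%/*}   # strip the prefix length: 10.0.0.2/24 -> 10.0.0.2
    # 'brd +' asks ip to derive the broadcast address from the prefix.
    printf 'ip addr add %s dev %s brd +\n' "$cidr" "$dev"
    printf 'ip route add default via %s dev %s src %s\n' "$gw" "$dev" "$addr"
}

# The two subnets from the example above.
edge_cmds 10.0.0.2/24 10.0.0.1 eth0
edge_cmds 10.1.0.2/24 10.1.0.1 eth0
```

Piping the output to "sh -x" as root would execute the commands once
they look right.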

My problem has been that the "from 10.x.0.0/24" parameter keeps getting
stripped by whatever processes /etc/sysconfig/network-scripts/route-ethN.
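
One possible way around this, sketched under assumptions rather than
tested against this setup: express the source selector as policy
routing rules instead of a "from" on the route itself. The sketch
assumes an initscripts version whose ifup-routes also reads a rule-ethN
file (each line handed to "ip rule add") next to route-ethN (each line
handed to "ip route add"). The helper name write_policy_files and the
table ids 100 and 101 are hypothetical. It writes into a scratch
directory so the result can be reviewed before anything under /etc is
touched.

```shell
#!/bin/sh
# Sketch only: emit rule-DEV and route-DEV contents into a scratch directory.
# write_policy_files and table ids 100/101 are hypothetical choices.
# Usage: write_policy_files OUTDIR DEVICE
write_policy_files() {
    outdir=$1 dev=$2
    # Assumed to be passed line by line to 'ip rule add'.
    cat > "$outdir/rule-$dev" <<EOF
from 10.0.0.0/24 table 100
from 10.1.0.0/24 table 101
EOF
    # Assumed to be passed line by line to 'ip route add'.
    cat > "$outdir/route-$dev" <<EOF
default via 10.0.0.1 dev $dev table 100
default via 10.1.0.1 dev $dev table 101
EOF
}

scratch=$(mktemp -d)
write_policy_files "$scratch" eth0
cat "$scratch/rule-eth0" "$scratch/route-eth0"
```

Whether this yields the same kernel state as the "from"-qualified
routes above would need checking with "ip rule show" and
"ip route show table 100" on the target machine.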