[CentOS-virt] Xen domU default gateway missing/ARP table full

Wed Nov 18 16:39:24 UTC 2009
Ken Bass <kbass at kenbass.com>

I have been trying to figure out why my domU NIC becomes unreachable 
(it cannot even be pinged) at various times, normally when the server is 
trying to update clamav from the various busy mirrors at 4am. There 
also seemed to be some latency when connecting, which I chalked up to it 
being a virtual machine.

When I checked my logs, I found thousands of entries like:
Nov 17 04:07:52 nomad kernel: Neighbour table overflow.
and applications reporting errors such as:
Nov 17 04:08:05 nomad freshclam[4085]: nonblock_connect: connect(): fd=5 errno=105: No buffer space available

I am running a routed (not bridged) configuration.

What I figured out is that each CentOS 5.4 domU maintains its own ARP 
table. That table fills up, which makes the network unreachable until 
entries are purged from the cache. Since this is a routed configuration, 
the ARP table should really only contain two or three entries: my domU, 
my dom0, and the gateway.
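
For anyone wanting to confirm the same symptom, here is a rough sketch (not 
part of my original debugging) that compares the number of neighbour entries 
against the kernel's hard limit. The helper names count_neigh_entries and 
neigh_table_status are made up for illustration; I am assuming the standard 
`ip neigh show` command and the usual gc_thresh3 sysctl path.

```shell
# Count neighbour-table entries supplied on stdin, for example the
# output of `ip neigh show`. Blank lines are ignored.
count_neigh_entries() {
    awk 'NF { n++ } END { print n + 0 }'
}

# Compare an entry count against a hard limit (gc_thresh3).
# Usage: neigh_table_status COUNT LIMIT
neigh_table_status() {
    if [ "$1" -ge "$2" ]; then
        echo "full"
    else
        echo "ok"
    fi
}

# On a live system (not executed here):
#   count=$(ip neigh show | count_neigh_entries)
#   limit=$(cat /proc/sys/net/ipv4/neigh/default/gc_thresh3)
#   neigh_table_status "$count" "$limit"
```

In a routed setup like mine the count should stay in the single digits, so a 
count anywhere near the limit would point at the routing problem rather than 
at a limit that is too low.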

It appears the network scripts under CentOS are ignoring the GATEWAY 
entry. I end up with a routing table of:
169.254.0.0     *               255.255.0.0     U     0      0        0 eth0
default         *               0.0.0.0         U     0      0        0 eth0
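
The thing to look for in that output is a default route with no gateway 
(a "*" in the Gateway column and no G flag). A small sketch of how that 
check could be scripted follows; default_route_gateway is a hypothetical 
helper name, and I am assuming the column layout printed by `route` or 
`route -n`.

```shell
# Read `route` or `route -n` output on stdin and report the state of
# the default route:
#   prints the gateway  if the default route has the G flag,
#   prints "none"       if a default route exists without a gateway,
#   prints "missing"    if there is no default route at all.
default_route_gateway() {
    awk '
        ($1 == "0.0.0.0" || $1 == "default") && $3 == "0.0.0.0" {
            found = 1
            if ($4 ~ /G/) print $2; else print "none"
            exit
        }
        END { if (!found) print "missing" }
    '
}

# On a live system (not executed here):
#   /sbin/route -n | default_route_gateway
```

Fed the two lines above, this prints "none", which is the broken state I am 
describing.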

The default route should point to the specific gateway IP address set in my 
/etc/sysconfig/network file. When I manually add the route, the ARP 
table issue is fixed. The network stack no longer tries to resolve an ARP 
entry for every IP address.

I found this bug in the Xen bug tracker, which was closed as INVALID with 
the comment 'Centos is broken'. That was in 2006.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=596

Any ideas on what is broken and what the correct fix is? Right now, I 
just added

/sbin/route add -net 0.0.0.0 netmask 0.0.0.0 gw x.x.x.x

to my /etc/rc.local, which seems like a hack solution.
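
If the workaround has to stay for now, a slightly less brittle variant 
might read the gateway from the existing config instead of hard-coding it. 
This is only a sketch under assumptions: read_gateway is a made-up helper, 
it assumes GATEWAY appears as a plain KEY=value line in 
/etc/sysconfig/network, and I have not verified that `ip route replace` 
behaves well alongside whatever the Xen scripts set up.

```shell
# Print the GATEWAY value from a sysconfig-style file (KEY=value lines).
# Surrounding double quotes are stripped; if the key appears more than
# once, the last occurrence wins. Prints nothing if the key is absent.
read_gateway() {
    sed -n 's/^GATEWAY=//p' "$1" | tr -d '"' | tail -n 1
}

# On a live system, as root, e.g. from /etc/rc.local (not executed here):
#   gw=$(read_gateway /etc/sysconfig/network)
#   [ -n "$gw" ] && /sbin/ip route replace default via "$gw" dev eth0
```

This does not explain why the GATEWAY entry is ignored in the first place, 
so it is still a workaround rather than the correct fix.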