[CentOS] bonding

Thu Apr 12 12:45:28 UTC 2007
Scott McClanahan <scott.mcclanahan at trnswrks.com>

I have four nodes set up to do active-backup bonding, and the drivers
loaded for the bonded network interfaces vary between tg3 and e100.  All
interfaces using the e100 driver report errors much like what you see
here:

bonding: bond0: link status definitely down for interface eth2, disabling it
e100: eth2: e100_watchdog: link up, 100Mbps, full-duplex
bonding: bond0: link status definitely up for interface eth2.

This happens all day on every node.  I have configured the bonding
module to do MII link monitoring every 100 milliseconds, and it uses
basic carrier link detection to test whether the interface is alive.
No modules were custom built on these nodes, and the OS is CentOS 4.3.
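
To put a number on "all day", a script along these lines could tally the
link-down events per slave from the kernel log.  This is only a sketch:
the name count_link_down is made up for illustration, and the regex
assumes the exact message wording quoted above, which may differ in
other versions of the bonding driver.

```python
import re
from collections import Counter

# Assumption: the log message looks like the one quoted in this post, e.g.
# "bonding: bond0: link status definitely down for interface eth2, disabling it"
DOWN_RE = re.compile(
    r"bonding: (\S+): link status definitely down for interface (\w+)"
)


def count_link_down(lines):
    """Count 'definitely down' events, keyed by (bond, slave)."""
    counts = Counter()
    for line in lines:
        match = DOWN_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample input built from the messages quoted in this post.
sample = [
    "bonding: bond0: link status definitely down for interface eth2, disabling it",
    "e100: eth2: e100_watchdog: link up, 100Mbps, full-duplex",
    "bonding: bond0: link status definitely up for interface eth2.",
]

for (bond, slave), n in sorted(count_link_down(sample).items()):
    print(f"{bond}/{slave}: {n} link-down event(s)")
```

On a real node the lines would presumably come from /var/log/messages or
dmesg output rather than a hard-coded list; where the kernel messages
end up depends on the syslog setup, so treat that as an assumption.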

Some more relevant information is below (the output is consistent
across all nodes):

[smccl@tf35 ~]$ uname -srvmpio
Linux 2.6.9-34.ELhugemem #1 SMP Wed Mar 8 00:47:12 CST 2006 i686 i686 i386 GNU/Linux

[smccl@tf35 ~]$ head -5 /etc/modprobe.conf
alias bond0 bonding
options bonding miimon=100 mode=1
alias eth0 tg3
alias eth1 tg3
alias eth2 e100

[smccl@tf35 ~]$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v2.6.1 (October 29, 2004)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:10:18:0c:86:a4

Slave Interface: eth2
MII Status: up
Link Failure Count: 12
Permanent HW addr: 00:02:55:ac:a2:ea
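
For comparing the nodes, the status file above can be reduced to a
per-slave failure count with something like the following.  It is a
minimal sketch that assumes the "Key: value" layout shown here (bonding
driver v2.6.1); parse_bond_status and flapping_slaves are hypothetical
helper names, not part of any existing tool.

```python
def parse_bond_status(text):
    """Map each slave interface name to its 'Key: value' fields."""
    slaves = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = slaves.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return slaves


def flapping_slaves(text, threshold=1):
    """Return {slave: failure count} for slaves at or above threshold."""
    result = {}
    for name, fields in parse_bond_status(text).items():
        count = int(fields.get("Link Failure Count", "0"))
        if count >= threshold:
            result[name] = count
    return result


# Sample taken from the slave sections of the output in this post.
SAMPLE = """\
Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:10:18:0c:86:a4

Slave Interface: eth2
MII Status: up
Link Failure Count: 12
Permanent HW addr: 00:02:55:ac:a2:ea
"""

print(flapping_slaves(SAMPLE))
```

On a live node the text would be read from /proc/net/bonding/bond0
instead of the embedded sample.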

Any idea why these e100 links report failures so often?  They are
directly plugged into a Cisco Catalyst 4506.  Thanks.