On Thursday, July 08, 2010 09:40 PM, JohnS wrote:
On Thu, 2010-07-08 at 07:51 -0500, Les Mikesell wrote:
I think some bridge or VLAN scenarios require promiscuous mode (and the corresponding disabling of hardware acceleration). Maybe the real issue is that something accidentally disabled it, and your network now only works when tcpdump re-enables it. I'm not sure how this is supposed to be managed atomically when multiple programs may manipulate it and it needs to be propagated across multiple bonded NICs, but maybe something went wrong there. At least some things log the change, so maybe you can get a hint about when it was turned on and off.
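One way to see whether an interface is currently in promiscuous mode, without relying on tcpdump, is to read its flags from sysfs. The sketch below assumes a Linux system that exposes /sys/class/net/IFACE/flags as a hex string and that the IFF_PROMISC bit is 0x100; the interface names eth0 and eth1 and the helper name is_promiscuous are only placeholders for illustration.

```python
# Sketch: decode the promiscuous bit from a sysfs flags value.
# Assumes Linux, where /sys/class/net/<iface>/flags holds a hex string
# such as "0x1103" and IFF_PROMISC is 0x100.
IFF_PROMISC = 0x100


def is_promiscuous(flags_text):
    """Return True if the hex flags string has the IFF_PROMISC bit set."""
    flags = int(flags_text.strip(), 16)
    return bool(flags & IFF_PROMISC)


def read_flags(iface):
    """Read the raw flags string for an interface from sysfs."""
    with open("/sys/class/net/%s/flags" % iface) as handle:
        return handle.read()


if __name__ == "__main__":
    # Hypothetical slave names; substitute the members of your bond.
    for name in ("eth0", "eth1"):
        try:
            state = is_promiscuous(read_flags(name))
        except (IOError, OSError):
            print("%s: not present" % name)
        else:
            print("%s: promiscuous=%s" % (name, state))
```

Running it before and after starting tcpdump on each slave should show whether the flag really flips on every bonded NIC.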
Check out /proc/net/bonding/YOUR_BOND. Make sure the Aggregator ID of each slave matches the ID of the active aggregator. If not, it will cause the problem you're having. It could also be bad NIC hardware; it's failing over for a reason, as the log showed.
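The aggregator check described above can be scripted. This is only a rough sketch: it assumes an 802.3ad bond whose status file has the usual layout, with one "Aggregator ID:" line under "Active Aggregator Info" and one under each "Slave Interface:" section. The function names are made up for illustration, and the path is passed on the command line rather than guessed.

```python
# Sketch: compare each slave's Aggregator ID with the active aggregator.
# Assumes the 802.3ad layout of /proc/net/bonding/<bond>: the first
# "Aggregator ID:" (before any "Slave Interface:") is the active one.
import re
import sys


def parse_aggregator_ids(text):
    """Return (active_id, {slave_name: aggregator_id})."""
    active_id = None
    slave_ids = {}
    current_slave = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Slave Interface:\s*(\S+)", line)
        if match:
            current_slave = match.group(1)
            continue
        match = re.match(r"Aggregator ID:\s*(\d+)", line)
        if match:
            agg = int(match.group(1))
            if current_slave is None:
                active_id = agg
            else:
                slave_ids[current_slave] = agg
    return active_id, slave_ids


def mismatched_slaves(text):
    """Return sorted slave names whose ID differs from the active one."""
    active_id, slave_ids = parse_aggregator_ids(text)
    return sorted(n for n, agg in slave_ids.items() if agg != active_id)


if __name__ == "__main__":
    # Usage: python check_bond.py /proc/net/bonding/YOUR_BOND
    with open(sys.argv[1]) as handle:
        bad = mismatched_slaves(handle.read())
    print("mismatched slaves: %s" % (", ".join(bad) or "none"))
```

An empty result means every slave joined the active aggregator; any name it prints is a slave that negotiated into a different one.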
They check out. What did help besides running tcpdump forever was to do a 'service network restart'. That made the network behave. I wonder what's going on...