I put this page together just so I won't spam the board anymore begging for help..lol http://bobhoffman.com/vmissue.html
This shows a working effort of bonded eths, bridged into a vm, and a few other things. The only missing thing is something on the host that ends up putting the VM internet connection into some kind of limbo.
Whether it is hardware related, bug related, libvirt nat related, I don't know. I will only post here on this issue again if it ever gets solved. At this point the server is a no go and getting shelved until I can find a tech that knows this stuff and can fix it.
right now: unsolvable.
I may just put some websites on the host computer until I can find a reliable way of keeping the virtual guest connection 100% up.
Hope this helps someone wanting to bridge or bond.
bob
On 07-02-12 04:28, Bob Hoffman wrote:
> I put this page together just so I won't spam the board anymore begging for help..lol http://bobhoffman.com/vmissue.html
According to http://wiki.centos.org/TipsAndTricks/BondingInterfaces there should not be a HWADDR=<mac_address> in ifcfg-eth0.
Regards, Patrick
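A quick way to check for a leftover HWADDR line could look like the sketch below. The function name `has_hwaddr` is made up for illustration, and the path in the usage comment assumes the stock CentOS network-scripts layout.

```shell
# Hypothetical helper: succeeds if the given ifcfg file still has a HWADDR= line.
has_hwaddr() {
    grep -Eq '^[[:space:]]*HWADDR=' "$1"
}

# Usage (path is an assumption, adjust to your system):
#   has_hwaddr /etc/sysconfig/network-scripts/ifcfg-eth0 && echo "HWADDR still set"
```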
On Mon, Feb 6, 2012 at 23:22, Patrick Lists <centos-list@puzzled.xs4all.nl> wrote:
> On 07-02-12 04:28, Bob Hoffman wrote:
>> I put this page together just so I won't spam the board anymore begging for help..lol http://bobhoffman.com/vmissue.html
>
> According to http://wiki.centos.org/TipsAndTricks/BondingInterfaces there should not be a HWADDR=<mac_address> in ifcfg-eth0.
>
> Regards, Patrick
I second that.
What may be happening is this: the VM host, which is explicitly set to use br0 as its main interface, works fine while bond0 communicates over eth0, and the connection holds as long as eth0's MAC is active for the br0 IP address. At some point, though, outbound traffic from the VM host ceases and the MAC address for eth0 times out of br0. After that, inbound traffic can go to whichever interface in bond0 answers first, and if that is eth1, traffic will time out, since the main host is probably only using eth0 and not br0 as its own interface. I would be curious to know what the main host's routing table has in it, and whether it is using eth0 or br0 as its communication interface.
try these:
# route -n
# ip route show table all
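To answer the eth0-or-br0 question at a glance, the output can be filtered for the device on the default route. This is only a sketch: `default_route_dev` is a made-up name, and it assumes the usual `default via <gateway> dev <interface>` layout printed by `ip route show`.

```shell
# Hypothetical helper: reads `ip route show` output on stdin and prints
# the device that carries the default route.
default_route_dev() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") { print $(i + 1); exit }
    }'
}

# Usage: ip route show | default_route_dev
```

If this prints eth0 or bond0 rather than br0, the host is not using the bridge as its own interface.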
On 02/06/2012 07:28 PM, Bob Hoffman wrote:
> I put this page together just so I won't spam the board anymore begging for help..lol http://bobhoffman.com/vmissue.html
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Virtualiza...
Your page doesn't include your sysctl.conf, so the information available makes it look like your guests are subject to firewall rules on the VM host, none of which allow access to them.
Have you tried disabling netfilter on the bridge device, as documented above?
If that doesn't help, I'm curious about the problem. Contact me off list.
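Since the link above is truncated, the following is an assumption about what it documents: the three `net.bridge.bridge-nf-call-*` sysctl keys set to 0. A rough check of whether a sysctl.conf already has them might look like this (`bridge_nf_disabled` is a made-up name):

```shell
# Hypothetical helper: succeeds only if the file sets all three
# bridge netfilter keys to 0, i.e. contains lines such as
#   net.bridge.bridge-nf-call-ip6tables = 0
#   net.bridge.bridge-nf-call-iptables = 0
#   net.bridge.bridge-nf-call-arptables = 0
bridge_nf_disabled() {
    conf="${1:-/etc/sysctl.conf}"
    for key in ip6tables iptables arptables; do
        grep -Eq '^[[:space:]]*net\.bridge\.bridge-nf-call-'"$key"'[[:space:]]*=[[:space:]]*0[[:space:]]*$' "$conf" || return 1
    done
    return 0
}

# Usage: bridge_nf_disabled /etc/sysctl.conf || echo "bridge netfilter still enabled in config"
```

This only inspects the file; the running values would still need `sysctl -p` or a reboot to take effect.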
Hi all, thanks for taking an interest. I've been populating the page with all the data I could.
Someone who sent me a mail noticed that there was still some NetworkManager stuff in the log messages. I had disabled it before..and did so again, fresh reboot, and there it was.
I am using startx to enter into a mini desktop on the unit (via IPMI) and that seemed to restart NetworkManager for some odd reason.
I cleaned it up some more and am waiting to see if it fails. (it can take up to one hour to fail, so I try to only do 3 or 4 things at once and wait til it goes...sigh.)
http://bobhoffman.com/vmissue.html
I tried with and without HWADDR in the files and found no difference. I took them out again for this try and am using mode 6 (balance-alb), which does its own MAC rewriting. A number of the modes require switch support according to the docs; I tried testing them out anyway, using HWADDR to get around it (well, worth a shot).
The worst part is the amount of time it takes. Sometimes almost an hour goes by and I say 'eureka!!' and continue working on the server, proud of myself...then 'bam' no connect....lol
screwing around on the vm got me to screw up the network on there now..seems eth0 does not exist and it wants eth2...lol sigh.
well, had to add something to it.
I found out I was having an issue with the addon ethernet card (e1000) 'link detected no' and it not working. Took it out? Yep? Work? No.....
However, I did add a second vm and something interesting is happening....
one vm stays up, one will crash...the one that stays does not die.
I am thinking that the vnet0 that comes up is messed up and I need to reset it somehow. Or...something else....but one staying up while other goes down is rather odd.
very strange.
I have no idea if this is the source of your problem (I wasn't using bonded interfaces), but it's sufficiently similar that you might want to try it.
I had a lot of problems with the network stack on VMs, both under VMware ESXi and Xen, where the network would just go numb. After a lot of spelunking I determined that it seemed to be related to faulty TCP segmentation offload (TSO). Generally speaking, between the VM, the virtual NICs, the hypervisor/host, and the physical network card, some levels figured that they'd offload segmentation handling to a lower layer, the lower layer wasn't doing it, and the upper layer thought that it was.
Under low network load everything seemed fine but as the network got pushed things would blow up and go numb.
Turning off TSO in the VM seemed to do the trick, although I think in the Xen case I turned it off in the host as well.
The basic command is: /sbin/ethtool -K ethX tso off
While I had the above command in rc.local, I would also run the attached script in /etc/cron.hourly as there were some circumstances where tso would get reenabled.
Good luck
Devin
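For anyone who cannot get at the attachment, an hourly check along those lines might look like the sketch below. This is not the force-tso script itself, just a guess at the idea; the function names are made up, and the pattern assumes `ethtool -k` reports the setting as either "tcp segmentation offload" or "tcp-segmentation-offload".

```shell
#!/bin/sh
# Hypothetical sketch, NOT the force-tso script mentioned above.

# Succeeds if the `ethtool -k` output on stdin reports TSO as on.
# Accepts both the older (spaces) and newer (hyphens) spellings.
tso_reported_on() {
    grep -Eq '^[[:space:]]*tcp[- ]segmentation[- ]offload:[[:space:]]*on'
}

# Turn TSO off on one interface, but only if ethtool says it is on.
force_tso_off() {
    iface="$1"
    if /sbin/ethtool -k "$iface" 2>/dev/null | tso_reported_on; then
        /sbin/ethtool -K "$iface" tso off
    fi
}

# Usage, e.g. from a file in /etc/cron.hourly:
#   force_tso_off eth0
```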
Devin Reade <gdr@gno.org> wrote:
[...]
> While I had the above command in rc.local, I would also run the attached script in /etc/cron.hourly as there were some circumstances where tso would get reenabled.
And in case attachments get stripped on the mailing list, you can also get the script here:
ftp://ftp.gno.org/pub/tools/force-tso
Devin
On 02/07/2012 07:26 PM, Devin Reade wrote:
> I had a lot of problems with the network stack on VMs, both under VMware ESXi and Xen, where the network would just go numb. After a lot of spelunking I determined that it seemed to be related to faulty TCP segmentation offload.
Yeah, wow. You just jogged my memory. Intel 82573(V/L/E) ethernet adapters had a serious bug that would cause TX hangs:
http://downloadmirror.intel.com/9180/eng/README.txt "82573(V/L/E) TX Unit Hang Messages"
Bob, what model cards did you have in your server?
=======================================
Gordon Messmer wrote:
> On 02/07/2012 07:26 PM, Devin Reade wrote:
>> I had a lot of problems with the network stack on VMs, both under VMware ESXi and Xen, where the network would just go numb. After a lot of spelunking I determined that it seemed to be related to faulty TCP segmentation offload.
>
> Yeah, wow. You just jogged my memory. Intel 82573(V/L/E) ethernet adapters had a serious bug that would cause TX hangs:
> http://downloadmirror.intel.com/9180/eng/README.txt "82573(V/L/E) TX Unit Hang Messages"
>
> Bob, what model cards did you have in your server?
=====================================
http://www.supermicro.com/products/system/2U/6026/SYS-6026T-NTR_.cfm
Intel® 82576 Dual-Port Gigabit Ethernet Controller (though I think this is basically e1000)
On 02/06/2012 09:28 PM, Bob Hoffman wrote:
> I put this page together just so I won't spam the board anymore begging for help..lol http://bobhoffman.com/vmissue.html

You're using bonding mode 0, which may not work when attached to a bridge. Try changing to mode 1 and playing with the cables. If everything works with mode 1, you've got an idea on where to focus.
As far as active/active bonding modes go, I know that mode 4 (LACP) is supposed to work, but that requires support on the switch(es).
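After switching modes it is worth confirming what the kernel actually applied. The sketch below reads the bonding driver's status file; `bond_mode` is a made-up name, and the default path assumes the bond is called bond0. On CentOS the mode itself is normally set through BONDING_OPTS in ifcfg-bond0 (for example "mode=1 miimon=100"), but check that against your own files.

```shell
# Hypothetical helper: prints the mode line from the bonding status file.
bond_mode() {
    sed -n 's/^Bonding Mode: //p' "${1:-/proc/net/bonding/bond0}"
}

# Usage: bond_mode          # reads /proc/net/bonding/bond0
#        bond_mode /proc/net/bonding/bond1
```

Mode 1 should show up as active-backup and mode 0 as round-robin.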
Although it was written in the context of Xen, you might also want to have a look at the netloop nloopbacks parameter as described in http://www.novell.com/communities/node/4094/xen-network-bridges-explained-with-troubleshooting-notes. On a Xen cluster with 3 physical interfaces per node I had to increase that parameter to keep interfaces from going numb.
I don't know how this translates to the libvirt/kvm world.
Devin