[CentOS] network intermittent, not sure if virtualization issue - in progress

Tue Feb 7 20:58:55 UTC 2012
Bob Hoffman <bob at bobhoffman.com>

Last post on this, sorta solved.

original post:
-----------------------------------

I have a computer I am using to host a virtual machine.
Both host and guest run CentOS 6, 64-bit.
The host machine's network connection seems fine. No problems.
Access to the virtual machine is usually fine,
but then, poof, ssh, http, and ftp all lose connection for about a minute.
Then they come back up.
I looked in all the logs on both machines and could find nothing, but I
am not sure where to look.
My question: would this be a setting on the VM as a webserver, some new
CentOS 6 setting that just times out the network when not in use? Or
something that I did when I bonded my eth ports and bridged them?
The bond covers the two onboard eth ports and one port from an add-on
network card.
It is intermittent and seems to happen at random. Running service network
restart on the webserver seems to fix it immediately, but it also just
fixes itself.
Is there some setting in CentOS 6 that must be changed to allow
constant 'uptime' of the network?
------------------------------
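
For anyone chasing a similar dropout, here is a minimal sketch (not
something I ran at the time) of how the "about a minute" outages could
be timestamped from another machine. It assumes Python 3, and the
function names, the port (22) and the polling interval are all
illustrative choices, not anything from my setup.

```python
import socket
import time


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_intervals(samples):
    """Collapse (timestamp, ok) samples into (start, end) outages.

    start is the first failed sample; end is the first good sample after
    it, or the last sample seen if the link never recovered.
    """
    intervals = []
    start = None
    last = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            intervals.append((start, ts))
            start = None
        last = ts
    if start is not None:
        intervals.append((start, last))
    return intervals


def watch(host, port=22, interval=5.0, count=720):
    """Poll host:port every `interval` seconds and report outages."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), is_reachable(host, port)))
        time.sleep(interval)
    return outage_intervals(samples)
```

Running watch() against each VM at the same time would show whether
the dropouts line up with anything in the host's logs.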

I took out the bond and found that was the issue. It works fine without it.
However, I also brought up a second VM and found something interesting.

1- With two VMs, only one failed; the other stayed up 100% of the time.
2- The second NIC was not working well, but even removing it did not
solve the issue.
3- Pinging the systems, I found the VM attached to vnet0 had exactly the
same ping times as the host; the vnet1 VM had double.
4- No matter what order the VMs were brought up in, whichever got
assigned libvirt's vnet0 would fail; the other would not fail at all.

5- The ping times of the host and the vnet0 VM were exactly the same
every time; the vnet1 VM was a little more than double that (12 ms
versus 28 ms).

6- The host never lost connection, even though it uses the same bridge
and bond to connect.
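
The ping comparison in points 3 and 5 could be repeated with something
like the sketch below. It assumes Python 3.7 or later and the Linux
iputils ping summary line ("rtt min/avg/max/mdev = ..."); the function
names are made up for illustration.

```python
import re
import subprocess

# Matches the summary line printed by Linux ping, for example:
#   rtt min/avg/max/mdev = 11.8/12.0/12.4/0.2 ms
RTT_RE = re.compile(
    r"min/avg/max/(?:mdev|stddev) = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_avg_rtt(ping_output):
    """Return the average round-trip time in ms from ping's summary."""
    match = RTT_RE.search(ping_output)
    if match is None:
        raise ValueError("no rtt summary line found in ping output")
    return float(match.group(2))


def avg_rtt(host, count=10):
    """Run ping against host and return its average RTT in ms."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_avg_rtt(result.stdout)


def rtt_ratio(baseline_ms, other_ms):
    """How many times slower other_ms is compared to baseline_ms."""
    return other_ms / baseline_ms
```

With the figures above, rtt_ratio(12.0, 28.0) comes out a little over 2,
which is the "more than double" I was seeing on the vnet1 VM.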

It seems logical to me that the host and the first VM are somehow in
conflict, and the host wins, via the bond software. With VMs, it seems
the host itself should not be connected to the bond, and that might
work. But I am way too over this to test it out.

Because they share the bridge and the bond, I suspect the first virtual
machine brought up, the one assigned libvirt's vnet0, eventually lost
some ARP contest to the host.

A third VM was added. It never failed as long as it was not brought up
first, and it had the same ping times as the vnet1 VM: double those of
the host and the vnet0 VM.

What is causing that is beyond my knowledge and is for experts on
libvirt's vnet system, bond software, and possibly eth bridges. All I
know is the host never failed even though it was using the same
bond/bridge, and maybe that is the real issue. In a VM environment maybe
the host should have its own connection, NOT on the bond shared by the VMs?

Using physical bridges may have confused the bond when that first VM
came up.....
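
If someone does pick this up, the first thing they will want is the
bonding mode and slave list that were in use. A minimal sketch for
pulling those out of the bonding driver's status file follows. It
assumes Python 3 and the usual /proc/net/bonding/<bond> layout with
"Bonding Mode:" and "Slave Interface:" lines; the bond name bond0 and
the function names are assumptions, not taken from my configuration.

```python
def parse_bonding_status(text):
    """Return (mode, slaves) from the text of /proc/net/bonding/<bond>."""
    mode = None
    slaves = []
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        value = value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            slaves.append(value)
    return mode, slaves


def read_bonding_status(bond="bond0"):
    """Read and parse the status file for the given bond (name assumed)."""
    with open("/proc/net/bonding/" + bond) as handle:
        return parse_bonding_status(handle.read())
```

Posting the mode and slaves along with the bridge setup would give the
bonding experts something concrete to work with.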

Well, that is a long couple of weeks' work. Right now I am just going to
assign the eths directly to the bridge and forget bonding as a really bad
nightmare.
I hope someone tests this out a bit and comes up with a brilliant yet
really techy solution.