[CentOS-virt] Loss of DHCP destroys bridge

Mon Jun 26 23:11:59 UTC 2017
Ken Bass <kbass at kenbass.com>

I am having trouble with recovery. Today, due to electrical work, I 
powered down my network's router / DHCP server.

My CentOS 7 host machines lost their DHCP leases (they are actually 
static leases). Once I powered my router / DHCP server back up, none of 
my virtual machines were accessible. It appears that when the DHCP lease 
was lost on the virt server, the bridge ports of the KVM guests were all 
lost. A 'brctl show' no longer listed any of the guests. So each guest 
was sending out DHCP requests as expected, but since the guest was no 
longer bridged, the packets went nowhere.
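
For anyone wanting to check the same thing without brctl, here is a 
minimal sketch that lists the ports attached to a bridge by reading 
sysfs. It assumes the usual Linux layout where each attached port shows 
up under /sys/class/net/<bridge>/brif/. The function name bridge_ports 
and the sysfs_root parameter are made up for illustration.

```python
import os


def bridge_ports(bridge="br0", sysfs_root="/sys/class/net"):
    """Return the sorted names of interfaces currently attached to `bridge`.

    The kernel exposes one entry per attached port under
    <sysfs_root>/<bridge>/brif/. An empty list means the bridge exists
    but has no ports; None means the bridge device (or its brif
    directory) is missing.
    """
    brif = os.path.join(sysfs_root, bridge, "brif")
    if not os.path.isdir(brif):
        return None
    return sorted(os.listdir(brif))


if __name__ == "__main__":
    # On a healthy host this would be expected to include the physical
    # NIC plus one vnetN entry per running guest.
    print(bridge_ports("br0"))
```

On my host after the outage, I would expect this to show the guests' 
vnet interfaces missing from the list.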

How do I handle proper recovery from this?

The sequence on the host seemed to be:

07:00:00 I powered off the DHCP server and router for electrical work

Jun 26 10:08:02 vm-server dhclient[973]: DHCPREQUEST on br0 to 
192.168.2.1 port 67 (xid=0x7627186d)

Requests continue for several hours, and then NetworkManager chimes in.

Jun 26 14:55:23 vm-server NetworkManager[840]: <info> [1498503323.1518] 
dhcp4 (br0): state changed bound -> expire

Jun 26 14:55:23 vm-server NetworkManager[840]: <info> [1498503323.1681] 
dhcp4 (br0): canceled DHCP transaction, DHCP client pid 973
Jun 26 14:55:23 vm-server NetworkManager[840]: <info> [1498503323.1681] 
dhcp4 (br0): state changed expire -> done
Jun 26 14:55:23 vm-server NetworkManager[840]: <info> [1498503323.1684] 
device (br0): scheduling DHCPv4 restart in 120 seconds, 3 tries left 
(reason: lease expired)

(looks like it schedules 3 DHCPv4 restart attempts, 120 seconds apart)

Jun 26 15:00:52 vm-server NetworkManager[840]: <warn> [1498503652.9531] 
dhcp4 (br0): request timed out
Jun 26 15:00:52 vm-server NetworkManager[840]: <info> [1498503652.9532] 
dhcp4 (br0): state changed unknown -> timeout
Jun 26 15:00:52 vm-server NetworkManager[840]: <info> [1498503652.9694] 
dhcp4 (br0): canceled DHCP transaction, DHCP client pid 15937
Jun 26 15:00:52 vm-server NetworkManager[840]: <info> [1498503652.9694] 
dhcp4 (br0): state changed timeout -> done
Jun 26 15:00:52 vm-server NetworkManager[840]: <info> [1498503652.9697] 
device (br0): scheduling DHCPv4 restart in 120 seconds, 1 tries left 
(reason: lease expired)
Jun 26 15:02:52 vm-server NetworkManager[840]: <info> [1498503772.9548] 
dhcp4 (br0): activation: beginning transaction (timeout in 45 seconds)
Jun 26 15:02:52 vm-server NetworkManager[840]: <info> [1498503772.9566] 
dhcp4 (br0): dhclient started with pid 15994
Jun 26 15:02:52 vm-server dhclient[15994]: DHCPDISCOVER on br0 to 
255.255.255.255 port 67 interval 4 (xid=0x54083ff1)
Jun 26 15:03:37 vm-server NetworkManager[840]: <warn> [1498503817.9818] 
dhcp4 (br0): request timed out
Jun 26 15:03:37 vm-server NetworkManager[840]: <info> [1498503817.9819] 
dhcp4 (br0): state changed unknown -> timeout
Jun 26 15:03:37 vm-server NetworkManager[840]: <info> [1498503817.9981] 
dhcp4 (br0): canceled DHCP transaction, DHCP client pid 15994
Jun 26 15:03:37 vm-server NetworkManager[840]: <info> [1498503817.9981] 
dhcp4 (br0): state changed timeout -> done
Jun 26 15:03:37 vm-server NetworkManager[840]: <info> [1498503817.9984] 
device (br0): state change: activated -> failed (reason 
'ip-config-unavailable') [100 120 5]
Jun 26 15:03:37 vm-server NetworkManager[840]: <info> [1498503817.9987] 
manager: NetworkManager state is now CONNECTED_LOCAL
Jun 26 15:03:38 vm-server NetworkManager[840]: <warn> [1498503818.0457] 
device (br0): Activation: failed for connection 'br0'
Jun 26 15:03:38 vm-server kernel: device enp3s0 left promiscuous mode
Jun 26 15:03:38 vm-server kernel: br0: port 1(enp3s0) entered disabled state
Jun 26 15:03:38 vm-server NetworkManager[840]: <info> [1498503818.0473] 
device (br0): detached bridge port enp3s0
Jun 26 15:03:38 vm-server kernel: device vnet0 left promiscuous mode
Jun 26 15:03:38 vm-server kernel: br0: port 2(vnet0) entered disabled state
Jun 26 15:03:38 vm-server NetworkManager[840]: <info> [1498503818.0483] 
device (br0): detached bridge port vnet0
Jun 26 15:03:38 vm-server kernel: device vnet2 left promiscuous mode
Jun 26 15:03:38 vm-server kernel: br0: port 3(vnet2) entered disabled state
Jun 26 15:03:38 vm-server NetworkManager[840]: <info> [1498503818.0493] 
device (br0): detached bridge port vnet2
Jun 26 15:03:38 vm-server kernel: device vnet3 left promiscuous mode

When I powered the router / DHCP server back on, the host got its lease 
again:

Jun 26 16:45:29 vm-server dhclient[26539]: DHCPOFFER from 192.168.2.1
Jun 26 16:45:29 vm-server dhclient[26539]: DHCPACK from 192.168.2.1 
(xid=0x4c7b48be)

None of the bridge ports that NetworkManager detached were restored.
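
As a stopgap, the detached ports can be pulled out of the log and turned 
into re-attach commands. The sketch below only prints 
"ip link set dev <port> master <bridge>" lines (standard iproute2 
syntax) and does not run them. The function names are hypothetical, and 
filtering on the "vnet" prefix assumes that the guest tap devices are 
the ones that need manual attention. I have not verified that 
re-attaching by hand is the proper recovery.

```python
import re

# Matches NetworkManager messages such as
#   device (br0): detached bridge port vnet0
# \s+ tolerates the line wrapping seen in the mail excerpt.
DETACHED = re.compile(
    r"device\s+\((?P<bridge>[^)]+)\):\s+detached\s+bridge\s+port\s+(?P<port>\S+)"
)


def detached_ports(log_text):
    """Return (bridge, port) pairs in log order, without duplicates."""
    seen = []
    for m in DETACHED.finditer(log_text):
        pair = (m.group("bridge"), m.group("port"))
        if pair not in seen:
            seen.append(pair)
    return seen


def reattach_commands(log_text, prefix="vnet"):
    """Build `ip link` commands for detached ports starting with `prefix`."""
    return [
        f"ip link set dev {port} master {bridge}"
        for bridge, port in detached_ports(log_text)
        if port.startswith(prefix)
    ]


# Lines copied from the log excerpt above.
SAMPLE = """\
Jun 26 15:03:38 vm-server NetworkManager[840]: <info> [1498503818.0473]
device (br0): detached bridge port enp3s0
Jun 26 15:03:38 vm-server NetworkManager[840]: <info> [1498503818.0483]
device (br0): detached bridge port vnet0
Jun 26 15:03:38 vm-server NetworkManager[840]: <info> [1498503818.0493]
device (br0): detached bridge port vnet2
"""

if __name__ == "__main__":
    for cmd in reattach_commands(SAMPLE):
        print(cmd)
```

Passing prefix="" to reattach_commands would include the physical NIC 
as well.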

Is this a case of being bitten by NetworkManager once again? I am not 
sure why I am running it, other than that it is the default, and I don't 
remember whether it is responsible for setting up the bridges.