[CentOS-virt] libvirtd hang on CentOS6 after latest updates

Tue May 22 09:33:08 UTC 2018
Karel Hendrych <k+centosvirt at karlos.cz>

Hi, I am seeing frequent libvirtd hangs (clients not responding) after 
the latest CentOS6-Xen update:

libvirt-libs-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-network-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-nwfilter-4.1.0-2.xen46.el6.x86_64
libgcc-4.4.7-18.el6_9.2.x86_64
2:qemu-img-0.12.1.2-2.503.el6_9.5.x86_64
libvirt-daemon-driver-storage-core-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-secret-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-interface-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-nodedev-4.1.0-2.xen46.el6.x86_64
10:centos-release-xen-common-8-4.el6.x86_64
xen-licenses-4.6.6-12.el6.x86_64
xen-libs-4.6.6-12.el6.x86_64
libvirt-daemon-driver-libxl-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-xen-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-qemu-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-storage-gluster-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-storage-logical-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-storage-mpath-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-storage-disk-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-storage-scsi-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-storage-iscsi-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-storage-4.1.0-2.xen46.el6.x86_64
libstdc++-4.4.7-18.el6_9.2.x86_64
libvirt-daemon-config-nwfilter-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-config-network-4.1.0-2.xen46.el6.x86_64
libvirt-daemon-driver-lxc-4.1.0-2.xen46.el6.x86_64
libvirt-client-4.1.0-2.xen46.el6.x86_64
linux-firmware-20171215-82.git2451bb22.el6.noarch
12:dhcp-common-4.1.1-53.P1.el6.centos.4.x86_64
12:dhclient-4.1.1-53.P1.el6.centos.4.x86_64
libvirt-4.1.0-2.xen46.el6.x86_64
10:centos-release-xen-46-8-4.el6.x86_64
10:centos-release-xen-44-8-4.el6.x86_64
tzdata-2018e-3.el6.noarch
libgomp-4.4.7-18.el6_9.2.x86_64
kernel-4.9.86-30.el6.x86_64
xen-hypervisor-4.6.6-12.el6.x86_64
xen-runtime-4.6.6-12.el6.x86_64
xen-4.6.6-12.el6.x86_64
libvirt-daemon-xen-4.1.0-2.xen46.el6.x86_64

The only remedy is to kill -9 libvirtd and start it again. The issue can 
be reproduced within a few domU starts. libvirtd usually hangs while the 
domU is bringing up its Xen drivers or doing something around udev, e.g. at:

xen_netfront: Initialising Xen virtual ethernet driver
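
For anyone who wants to automate the workaround while this is being 
debugged, here is a rough sketch of the check-and-restart step. It is only 
an illustration, not something I run in production. Assumptions: Python 3 
is available on the dom0, a hang shows up as "virsh list --all" not 
returning within a timeout, and "service libvirtd start" is the right way 
to start the daemon on this box. The function names and the 10 second 
timeout are made up for the example.

```python
import subprocess


def libvirtd_responds(timeout_s=10, cmd=("virsh", "list", "--all")):
    """Return True if the client command finishes within timeout_s seconds.

    Only a hang is detected here: a command that exits quickly with an
    error (e.g. the daemon is not running at all) still counts as a response.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return False


def restart_libvirtd(dry_run=True):
    """Apply the manual remedy: kill -9 libvirtd, then start it again.

    With dry_run=True (the default) the commands are only printed.
    """
    steps = [
        ["pkill", "-9", "-x", "libvirtd"],   # assumption: pkill from procps
        ["service", "libvirtd", "start"],    # assumption: SysV init script
    ]
    for step in steps:
        if dry_run:
            print("would run:", " ".join(step))
        else:
            subprocess.run(step, check=False)
    return steps


if __name__ == "__main__":
    if not libvirtd_responds():
        restart_libvirtd(dry_run=True)
```

The dry-run default is deliberate: killing libvirtd on a false positive 
(for example a slow but healthy daemon) is not something to do blindly.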

I've been looking into the libvirtd strace and debug logs. So far the 
most suspicious thing in the libvirtd debug log is this:

libvirtd.log:2018-05-22 08:32:44.760+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/tx-7'
libvirtd.log:2018-05-22 08:32:44.761+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/tx-6'
libvirtd.log:2018-05-22 08:32:44.761+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/tx-4'
libvirtd.log:2018-05-22 08:32:44.762+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/tx-5'
libvirtd.log:2018-05-22 08:32:44.763+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/tx-2'
libvirtd.log:2018-05-22 08:32:44.764+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/tx-3'
libvirtd.log:2018-05-22 08:32:44.765+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/rx-6'
libvirtd.log:2018-05-22 08:32:44.766+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/rx-5'
libvirtd.log:2018-05-22 08:32:44.767+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/rx-4'
libvirtd.log:2018-05-22 08:32:44.767+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/rx-7'
libvirtd.log:2018-05-22 08:32:44.768+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/rx-2'
libvirtd.log:2018-05-22 08:32:44.769+0000: 25455: debug : udevRemoveOneDevice:1289 : Failed to find device to remove that has udev name '/sys/devices/vif-24-0/net/vif24.0/queues/rx-3'
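
To see how many of these entries pile up per vif in a long debug log, 
something like the sketch below could be used. It is only a rough helper, 
assuming Python 3 and assuming the entries look exactly like the ones 
quoted above; the regular expression and the function name summarize() are 
mine, not anything shipped with libvirt. Whitespace is collapsed first so 
that entries wrapped by a mail client still match.

```python
import re
from collections import Counter

# One udevRemoveOneDevice "Failed to find device" debug entry, as quoted above.
ENTRY = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\+\d{4}): "
    r"(?P<tid>\d+): debug : udevRemoveOneDevice:\d+ : "
    r"Failed to find device to remove that has udev name '(?P<path>[^']+)'"
)


def summarize(log_text):
    """Count failed removals per parent device and collect the thread IDs."""
    flat = " ".join(log_text.split())  # undo line wrapping
    by_parent = Counter()
    threads = set()
    for match in ENTRY.finditer(flat):
        path = match.group("path")
        # Group tx-N / rx-N queue entries under the vif they belong to.
        parent, _, _queue = path.rpartition("/queues/")
        by_parent[parent or path] += 1
        threads.add(int(match.group("tid")))
    return by_parent, threads


if __name__ == "__main__":
    sample = (
        "libvirtd.log:2018-05-22 08:32:44.760+0000: 25455: debug : \n"
        "udevRemoveOneDevice:1289 : Failed to find device to remove that has udev \n"
        "name '/sys/devices/vif-24-0/net/vif24.0/queues/tx-7'\n"
        "libvirtd.log:2018-05-22 08:32:44.769+0000: 25455: debug : \n"
        "udevRemoveOneDevice:1289 : Failed to find device to remove that has udev \n"
        "name '/sys/devices/vif-24-0/net/vif24.0/queues/rx-3'\n"
    )
    counts, tids = summarize(sample)
    for device, n in counts.items():
        print(device, n, sorted(tids))
```

Grouping by the part of the path before /queues/ makes it easy to tell 
whether the failures all belong to one vif or are spread across several domUs.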

Reducing the number of driver queues did not get rid of these messages 
(I am not sure whether that setting applies to PV guests at all).

Is anyone out there seeing similar issues? Would anyone be interested in 
reviewing the full debug log / strace?

Cheers
Karel