Hello,
I have an OpenStack setup in a virtual environment on CentOS 7.
The baremetal host has *nested KVM* enabled and runs one compute node as a VM.
Inside the compute node I have multiple VMs running.
About every 3 days the VMs become inaccessible and the compute node reports high CPU usage. The qemu-kvm process for each VM inside the compute node reports full CPU usage.
Please help me with some hints to debug this issue.
Thanks, Laurentiu
More details on the subject:
I suppose it is a nested KVM issue because it appeared after I enabled the nested KVM feature. Without nested KVM, however, the second-level VMs are unusable in terms of performance.
I am using CentOS 7 with:
kernel: 3.10.0-327.22.2.el7.x86_64
qemu-kvm: 1.5.3-105.el7_2.4
libvirt: 1.2.17-13.el7_2.5
on both the baremetal and the compute VM.
The only workaround for now is to shut down the compute VM and start it again from the baremetal host with virsh start. A simple reboot of the compute node doesn't help. It looks like the qemu-kvm process corresponding to the compute VM is the problem.
Laurentiu
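Since the suspicion falls on nested KVM, one quick check on the baremetal host is whether the feature is actually switched on for the kvm_intel module. A minimal Python sketch, assuming an Intel host where the flag is exposed at /sys/module/kvm_intel/parameters/nested (the function name and the path argument are illustrative and not from the thread):

```python
from pathlib import Path

# Assumed location of the flag on an Intel host (AMD hosts use kvm_amd instead).
NESTED_PARAM = "/sys/module/kvm_intel/parameters/nested"


def nested_kvm_enabled(param_path=NESTED_PARAM):
    """Return True if the module reports nested virtualization as enabled.

    Depending on the kernel, the flag reads as 'Y'/'N' or '1'/'0'.
    A missing file is treated as "not enabled" (module not loaded).
    """
    path = Path(param_path)
    if not path.exists():
        return False
    return path.read_text().strip() in ("Y", "y", "1")
```

Calling nested_kvm_enabled() with no argument on the baremetal host reads the real parameter; the path argument exists only so the function can be exercised against a test file.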
On Sun, 14 Aug 2016 at 00:19, Laurentiu Soica <laurentiu@soica.ro> wrote:
From: centos-virt-bounces@centos.org on behalf of Laurentiu Soica <laurentiu@soica.ro>
Sent: Sunday, August 14, 2016 10:17 AM
To: Discussion about the virtualization on CentOS
Subject: Re: [CentOS-virt] Nested KVM issue
Please post:
1) # virsh dumpxml VM-L1 (VM-L1 being the L1 guest where you expect nested KVM to appear)
2) Log in to VM-L1 and run:
# lsmod | grep kvm
3) I also need the following outputs from VM-L1 (in case it is the compute node):
# cat /etc/nova/nova.conf | grep virt_type
# cat /etc/nova/nova.conf | grep cpu_mode
Boris.
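The two nova.conf lookups requested above can also be done in one pass. This is a rough Python sketch, assuming nova.conf is INI-style and that virt_type and cpu_mode sit in its [libvirt] section; libvirt_settings is a hypothetical helper name, not part of Nova:

```python
import configparser


def libvirt_settings(conf_text):
    """Extract virt_type and cpu_mode from the text of a nova.conf file.

    Commented-out lines are ignored by the parser. None is returned for
    an option that is not set.
    """
    # interpolation=None: values containing '%' must not be expanded.
    # strict=False: tolerate duplicated sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return {
        key: parser.get("libvirt", key, fallback=None)
        for key in ("virt_type", "cpu_mode")
    }
```

On the compute node this would be fed the contents of /etc/nova/nova.conf, for example libvirt_settings(open("/etc/nova/nova.conf").read()).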
Hello,
1.
<domain type='kvm' id='6'>
  <name>baremetalbrbm_1</name>
  <uuid>534e9b54-5e4c-4acb-adcf-793f841551a7</uuid>
  <memory unit='KiB'>104857600</memory>
  <currentMemory unit='KiB'>104857600</currentMemory>
  <vcpu placement='static'>36</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='hd'/>
    <bootmenu enable='no'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough'/>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='unsafe'/>
      <source file='/var/lib/libvirt/images/baremetalbrbm_1.qcow2'/>
      <backingStore/>
      <target dev='sda' bus='sata'/>
      <alias name='sata0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='usb' index='0'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='sata0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='00:f1:15:20:c5:46'/>
      <source network='brbm' bridge='brbm'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='654ad04f-fa0a-41dd-9d30-b84e702462fe'/>
      </virtualport>
      <target dev='vnet5'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:d3:c9:24'/>
      <source bridge='br57'/>
      <target dev='vnet6'/>
      <model type='rtl8139'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/3'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/3'>
      <source path='/dev/pts/3'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='5903' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>
2.
[root@overcloud-novacompute-0 ~]# lsmod | grep kvm
kvm_intel 162153 70
kvm 525409 1 kvm_intel
[root@overcloud-novacompute-0 ~]# cat /etc/nova/nova.conf | grep virt_type|grep -v '^#'
virt_type=kvm
[root@overcloud-novacompute-0 ~]# cat /etc/nova/nova.conf | grep cpu_mode|grep -v '^#'
cpu_mode=host-passthrough
Thanks, Laurentiu
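Only a few fields of the dumpxml output matter for this problem: the vCPU count, the memory size, and the CPU mode. The Python sketch below pulls them out with the standard library XML parser. It assumes the memory element uses unit='KiB' as in the dump above, and summarize_domain is an illustrative name rather than a libvirt API:

```python
import xml.etree.ElementTree as ET


def summarize_domain(xml_text):
    """Pick the nested-KVM-relevant fields out of `virsh dumpxml` output."""
    root = ET.fromstring(xml_text)
    cpu = root.find("cpu")
    return {
        "name": root.findtext("name"),
        "vcpus": int(root.findtext("vcpu")),
        # Assumes <memory unit='KiB'>; 1 GiB = 1024 * 1024 KiB.
        "memory_gib": int(root.findtext("memory")) / (1024 * 1024),
        "cpu_mode": cpu.get("mode") if cpu is not None else None,
    }
```

For the domain posted above this gives 36 vCPUs, 100 GiB of memory, and cpu mode host-passthrough, which is the mode that lets the L1 guest see the host's virtualization extensions.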
On Sun, 14 Aug 2016 at 21:44, Boris Derzhavets <bderzhavets@hotmail.com> wrote:
CentOS-virt mailing list CentOS-virt@centos.org https://lists.centos.org/mailman/listinfo/centos-virt
The reports posted look good to me. This config should provide the best available performance for a cloud VM (L2) on the compute node.
1. Please remind me what goes wrong from your standpoint.
2. Which CPU is installed on the compute node, and how much RAM?
Actually, my concern is the number of cloud VMs versus the number of CPU cores (not threads).
Please check the `top` report with regard to swap area size.
Thanks.
Boris.
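The swap figures that `top` shows come from /proc/meminfo, so they can also be read directly. A small Python sketch, assuming the usual "Name: value kB" layout of that file (the helper names are made up for illustration):

```python
def meminfo_values(meminfo_text):
    """Parse /proc/meminfo text into a {field: integer value} mapping.

    Most fields are reported in kB; a few (such as HugePages_Total) are
    plain counts, so the unit is not interpreted here.
    """
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_used_kb(meminfo_text):
    """Return SwapTotal - SwapFree in kB (0 when no swap is configured)."""
    info = meminfo_values(meminfo_text)
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)
```

On the compute node, swap_used_kb(open("/proc/meminfo").read()) would return the amount of swap currently in use.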
From: centos-virt-bounces@centos.org on behalf of Laurentiu Soica <laurentiu@soica.ro>
Sent: Sunday, August 14, 2016 3:06 PM
To: Discussion about the virtualization on CentOS
Subject: Re: [CentOS-virt] Nested KVM issue
Hello Boris,
1. About three days after a reboot (this has happened several times already) the compute node reports high CPU usage. It has 36 vCPUs and reports a load higher than 40; usually the load is about 2 or 3. The VMs' qemu-kvm processes report 100% CPU usage per vCPU (almost 400% for a VM with 4 vCPUs, almost 100% for one with 1 vCPU). The VMs are no longer accessible through SSH.
2. The baremetal host has 2 CPUs, each with 10 cores, with HT enabled, so it reports 40 CPUs. It has 128 GB RAM, of which 100 GB are assigned to the compute node.
I have 15 VMs running inside the compute node. They add up to 40 vCPUs and 92 GB RAM.
There are no swap devices on the compute node, so the reported SwapTotal is 0 KB.
I'll check whether the memory on the compute node gets exhausted as soon as the problem reproduces (in about 2 days), but for now there are more than 80 GB available.
Note that a reboot of the compute node doesn't fix the problem. Only a shutdown of the compute node followed by a virsh start works.
Thanks, Laurentiu
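The sizing figures in this message can be put side by side to see how much is overcommitted at each level. The Python sketch below only restates the arithmetic with the numbers quoted above; the class and field names are illustrative:

```python
from dataclasses import dataclass


@dataclass
class Sizing:
    sockets: int
    cores_per_socket: int
    threads_per_core: int
    l1_vcpus: int    # vCPUs given to the compute node (L1)
    l2_vcpus: int    # sum of vCPUs of the cloud VMs (L2)
    l1_ram_gb: int   # RAM given to the compute node
    l2_ram_gb: int   # sum of RAM of the cloud VMs

    @property
    def physical_cores(self):
        return self.sockets * self.cores_per_socket

    @property
    def logical_cpus(self):
        return self.physical_cores * self.threads_per_core

    def ratios(self):
        return {
            "l2_vcpus_per_physical_core": self.l2_vcpus / self.physical_cores,
            "l2_vcpus_per_l1_vcpu": self.l2_vcpus / self.l1_vcpus,
            "host_threads_not_given_to_l1": self.logical_cpus - self.l1_vcpus,
            "l2_ram_fraction_of_l1": self.l2_ram_gb / self.l1_ram_gb,
        }


# Numbers as reported in the thread: 2 x 10 cores with HT, 36 vCPUs and
# 100 GB for the compute node, 15 VMs totalling 40 vCPUs and 92 GB.
reported = Sizing(2, 10, 2, l1_vcpus=36, l2_vcpus=40, l1_ram_gb=100, l2_ram_gb=92)
print(reported.ratios())
```

With these inputs the cloud VMs hold 2 vCPUs per physical core and slightly more vCPUs than the compute node itself has, while 4 host threads remain outside the compute node and the VMs claim 92% of its RAM.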
On Sun, 14 Aug 2016 at 23:27, Boris Derzhavets <bderzhavets@hotmail.com> wrote:
I would try decreasing the number of vCPUs allocated to the cloud VMs, say from 4 to 2. My guess is that there are not enough vCPUs left to run the OS itself.
I also guess the CPU model is older than Haswell. Please confirm (or not) if possible.
In my experience, since Haswell was launched, Intel Xeons based on this core (or later cores) behave much better than Sandy Bridge or Ivy Bridge based ones.
Boris.
From: centos-virt-bounces@centos.org on behalf of Laurentiu Soica <laurentiu@soica.ro>
Sent: Monday, August 15, 2016 1:15 AM
To: Discussion about the virtualization on CentOS
Subject: Re: [CentOS-virt] Nested KVM issue
The CPUs are Xeon E5-2670 v2, Ivy Bridge microarchitecture.
I could try lowering the vCPUs, but I doubt it would help. Please note that the same VMs run just fine (with a load of 2 out of an acceptable 36 on the compute node) for about 3 days after a restart.
On Mon, 15 Aug 2016 at 08:42, Boris Derzhavets <bderzhavets@hotmail.com> wrote:
*From:* centos-virt-bounces@centos.org centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro *Sent:* Sunday, August 14, 2016 3:06 PM
*To:* Discussion about the virtualization on CentOS *Subject:* Re: [CentOS-virt] Nested KVM issue Hello,
<domain type='kvm' id='6'>
  <name>baremetalbrbm_1</name>
  <uuid>534e9b54-5e4c-4acb-adcf-793f841551a7</uuid>
  <memory unit='KiB'>104857600</memory>
  <currentMemory unit='KiB'>104857600</currentMemory>
  <vcpu placement='static'>36</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='hd'/>
    <bootmenu enable='no'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough'/>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='unsafe'/>
      <source file='/var/lib/libvirt/images/baremetalbrbm_1.qcow2'/>
      <backingStore/>
      <target dev='sda' bus='sata'/>
      <alias name='sata0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='usb' index='0'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='sata0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='00:f1:15:20:c5:46'/>
      <source network='brbm' bridge='brbm'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='654ad04f-fa0a-41dd-9d30-b84e702462fe'/>
      </virtualport>
      <target dev='vnet5'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:d3:c9:24'/>
      <source bridge='br57'/>
      <target dev='vnet6'/>
      <model type='rtl8139'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/3'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/3'>
      <source path='/dev/pts/3'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='5903' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>

[root@overcloud-novacompute-0 ~]# lsmod | grep kvm
kvm_intel 162153 70
kvm 525409 1 kvm_intel

[root@overcloud-novacompute-0 ~]# cat /etc/nova/nova.conf | grep virt_type|grep -v '^#'
virt_type=kvm

[root@overcloud-novacompute-0 ~]# cat /etc/nova/nova.conf | grep cpu_mode|grep -v '^#'
cpu_mode=host-passthrough
Thanks, Laurentiu
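The `lsmod` output above shows that `kvm_intel` is loaded inside the compute node, but not whether nesting is switched on at the baremetal level. A minimal sketch for checking that, assuming the standard `nested` parameter of the `kvm_intel` module under `/sys/module`; the function name and the optional path argument are illustrative only:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kvm_intel "nested" parameter is enabled.
# The optional argument lets a different file be checked (useful for testing).
nested_kvm_enabled() {
    param="${1:-/sys/module/kvm_intel/parameters/nested}"

    # Return 2 if the parameter file is missing (module not loaded, or AMD host).
    [ -r "$param" ] || return 2

    # Older kernels report Y/N, newer ones 1/0.
    case "$(cat "$param")" in
        Y|1) return 0 ;;
        *)   return 1 ;;
    esac
}
```

On the baremetal, `nested_kvm_enabled && echo "nested KVM is on"` would print the message only when nesting is enabled.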
On Sun, Aug 14, 2016 at 21:44, Boris Derzhavets <bderzhavets@hotmail.com> wrote:

*From:* centos-virt-bounces@centos.org on behalf of Laurentiu Soica <laurentiu@soica.ro>
*Sent:* Sunday, August 14, 2016 10:17 AM
*To:* Discussion about the virtualization on CentOS
*Subject:* Re: [CentOS-virt] Nested KVM issue

More details on the subject:
I suppose it is a nested KVM issue because it appeared after I enabled the nested KVM feature. Without it, anyway, the second-level VMs are unusable in terms of performance.
I am using CentOS 7 with:
kernel: 3.10.0-327.22.2.el7.x86_64
qemu-kvm: 1.5.3-105.el7_2.4
libvirt: 1.2.17-13.el7_2.5
on both the baremetal and the compute VM.

*Please post:*
1) # virsh dumpxml VM-L1 (where VM-L1 is the L1 guest in which you expect nested KVM to appear)
2) Log in to VM-L1 and run: # lsmod | grep kvm
3) I need these outputs from VM-L1 (in case it is the compute node):
# cat /etc/nova/nova.conf | grep virt_type
# cat /etc/nova/nova.conf | grep cpu_mode
Boris.
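The two `nova.conf` checks requested above can be collapsed into a single `grep` that also skips commented-out lines. This is just a sketch; the function name and the optional path argument are assumptions added for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the active virt_type and cpu_mode settings.
# Lines starting with '#' do not match because the pattern is anchored
# to the option name at the start of the line.
nova_libvirt_opts() {
    conf="${1:-/etc/nova/nova.conf}"
    grep -E '^[[:space:]]*(virt_type|cpu_mode)[[:space:]]*=' "$conf"
}
```

Run on the compute node as `nova_libvirt_opts`; with the settings reported in this thread it would be expected to list `virt_type=kvm` and `cpu_mode=host-passthrough`.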
CentOS-virt mailing list CentOS-virt@centos.org https://lists.centos.org/mailman/listinfo/centos-virt
I would run dmesg from cron once per hour and then analyze the captured logs.
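One way to follow this suggestion is a small script that saves a timestamped `dmesg` snapshot on each run. The sketch below is an assumption-laden illustration: the function name, the default directory `/var/log/dmesg-snapshots`, and the `DMESG_CMD` override are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write one timestamped kernel-log snapshot per call
# and print the path of the file that was written.
snapshot_dmesg() {
    log_dir="${1:-/var/log/dmesg-snapshots}"
    mkdir -p "$log_dir" || return 1

    out="$log_dir/dmesg-$(date +%Y%m%d-%H%M%S).log"

    # DMESG_CMD allows the command to be swapped out when testing;
    # it is intentionally unquoted so that arguments are split.
    ${DMESG_CMD:-dmesg} > "$out" 2>&1

    echo "$out"
}
```

Saved as a script that ends with a call to `snapshot_dmesg`, it could be scheduled hourly with a crontab entry such as `0 * * * * /usr/local/sbin/dmesg-snapshot.sh` (the path is hypothetical). Running it on both the baremetal and the compute node would allow the two logs to be compared around the time the load jumps.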
From: centos-virt-bounces@centos.org on behalf of Laurentiu Soica <laurentiu@soica.ro>
Sent: Monday, August 15, 2016 2:15 AM
To: Discussion about the virtualization on CentOS
Subject: Re: [CentOS-virt] Nested KVM issue

The CPUs are IvyBridge microarchitecture, Xeon E5-2670 v2.
I could try lowering the vCPUs but I doubt it would help. Please note that the same VMs are running just fine (with a load of 2 out of an acceptable 36 on the compute node) for about 3 days after a restart.
On Mon, Aug 15, 2016 at 08:42, Boris Derzhavets <bderzhavets@hotmail.com> wrote:
I would try decreasing the number of vCPUs allocated to the cloud VMs, say from 4 to 2. My guess is that there are not enough vCPUs left to run the OS itself.
I also guess the CPU model is older than Haswell. Please confirm (or not) if possible.
In my experience, since Haswell was launched, Haswell-based Intel Xeons running this kernel (or later kernels) behave much better than Sandy Bridge or Ivy Bridge based ones.
Boris.
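To put numbers on the vCPU concern, the figures reported in this thread (40 vCPUs across the cloud VMs, 36 vCPUs on the compute node, 2 x 10 = 20 physical cores on the baremetal) give the following ratios. The helper name is illustrative only:

```shell
#!/bin/sh
# Hypothetical helper: ratio of allocated vCPUs ($1) to available CPUs ($2).
overcommit_ratio() {
    awk -v alloc="$1" -v avail="$2" 'BEGIN { printf "%.2f\n", alloc / avail }'
}

overcommit_ratio 40 36   # cloud-VM vCPUs per compute-node vCPU -> 1.11
overcommit_ratio 40 20   # cloud-VM vCPUs per physical core     -> 2.00
overcommit_ratio 36 20   # compute-node vCPUs per physical core -> 1.80
```

The cloud VMs are therefore only slightly overcommitted against the compute node's vCPUs, but by a factor of 2 against physical cores, which is the comparison raised earlier in the thread (cores, not threads).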
Hello,
The issue reproduced again, and it doesn't look like a swap problem. Some details:

On the baremetal, from top:

top - 08:08:52 up 5 days, 16:43, 3 users, load average: 36.19, 36.05, 36.05
Tasks: 493 total, 1 running, 492 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.5 us, 87.9 sy, 0.0 ni, 8.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 12357451+total, 14296000 free, 65634428 used, 43644088 buff/cache
KiB Swap: 4194300 total, 4073868 free, 120432 used. 56953888 avail Mem

19158 qemu 20 0 0.098t 0.041t 10476 S 3650 35.6 13048:24 qemu-kvm

The compute node has 36 vCPUs, so its qemu-kvm process at about 3650% means all of them are now fully used. There are more than 50 GB of memory still available on the baremetal. The swap is barely used (120 MB).

On the compute node, from top:

top - 05:11:58 up 1 day, 15:08, 2 users, load average: 40.46, 40.49, 40.74
%Cpu(s): 99.1 us, 0.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.1 si, 0.1 st
KiB Mem : 10296246+total, 78079936 free, 23671360 used, 1211160 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 78939968 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6032 qemu 20 0 10.601g 1.272g 12964 S 400.0 1.3 588:40.39 qemu-kvm
5673 qemu 20 0 10.602g 1.006g 13020 S 399.7 1.0 1161:47 qemu-kvm
5998 qemu 20 0 10.601g 1.192g 13028 S 367.9 1.2 1544:30 qemu-kvm
5951 qemu 20 0 10.601g 1.246g 13020 S 348.3 1.3 1547:38 qemu-kvm
5750 qemu 20 0 10.599g 990136 13060 S 339.1 1.0 1152:25 qemu-kvm
5752 qemu 20 0 10.598g 1.426g 13040 S 313.9 1.5 663:13.65 qemu-kvm
....

There are more than 70 GB of memory available on the compute node. All VMs are using 100% of their CPUs and they are no longer accessible.
Laurentiu
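The memory figures quoted from `top` above can also be read directly from `/proc/meminfo`, which is easier to log over time. A minimal sketch, assuming the kernel exposes the `MemAvailable` field (the "avail Mem" column in the output above suggests it does); the function name and optional path argument are illustrative:

```shell
#!/bin/sh
# Hypothetical helper: print available memory and total swap in GiB.
# /proc/meminfo reports values in KiB, so divide by 1024 * 1024.
mem_summary() {
    awk '/^(MemAvailable|SwapTotal):/ { printf "%s %.1f GiB\n", $1, $2 / 1048576 }' \
        "${1:-/proc/meminfo}"
}
```

Applied to the compute-node values above (78939968 KiB available, no swap), this would report about 75.3 GiB available and 0.0 GiB of swap.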
Sorry, how do you trigger the problem?
B.
Running the compute node for several days simply triggers it.
Is KSM enabled on your compute nodes (presuming CentOS 7.2 on bare metal)?
Yes. It is enabled on both the baremetal and the compute node.
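For reference, the KSM state can be read from the kernel's sysfs interface on each level (baremetal and compute node). The sketch below assumes the standard `/sys/kernel/mm/ksm/run` file; the function name and the optional directory argument are illustrative only:

```shell
#!/bin/sh
# Hypothetical helper: report the KSM state from sysfs.
# run = 0: ksmd stopped, merged pages kept
# run = 1: ksmd running
# run = 2: ksmd stopped and all merged pages unmerged
ksm_status() {
    ksm_dir="${1:-/sys/kernel/mm/ksm}"
    [ -r "$ksm_dir/run" ] || { echo "KSM not available"; return 1; }

    case "$(cat "$ksm_dir/run")" in
        0) echo "stopped" ;;
        1) echo "running" ;;
        2) echo "stopped, pages unmerged" ;;
        *) echo "unknown" ;;
    esac
}
```

Comparing the output of `ksm_status` before and after the load spike, on both levels, might show whether KSM activity correlates with the problem; that correlation is not established anywhere in this thread.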
În mar., 16 aug. 2016 la 13:37, Boris Derzhavets bderzhavets@hotmail.com a scris:
Is KSM enabled on your Compute Nodes ( presuming CentOS 7.2 on bare metal ) ?
*From:* centos-virt-bounces@centos.org centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro *Sent:* Tuesday, August 16, 2016 5:25 AM
*To:* Discussion about the virtualization on CentOS *Subject:* Re: [CentOS-virt] Nested KVM issue Running the compute node for several days simply triggers it.
În mar., 16 aug. 2016 la 12:12, Boris Derzhavets bderzhavets@hotmail.com a scris:
Sorry,
How you trigger the problem ?
B.
*From:* centos-virt-bounces@centos.org centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro *Sent:* Tuesday, August 16, 2016 3:28 AM
*To:* Discussion about the virtualization on CentOS *Subject:* Re: [CentOS-virt] Nested KVM issue Hello,
The issue reproduced again and it doesn't look like a swap problem. Some details:
On the baremetal, from top:

top - 08:08:52 up 5 days, 16:43, 3 users, load average: 36.19, 36.05, 36.05
Tasks: 493 total, 1 running, 492 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.5 us, 87.9 sy, 0.0 ni, 8.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 12357451+total, 14296000 free, 65634428 used, 43644088 buff/cache
KiB Swap: 4194300 total, 4073868 free, 120432 used. 56953888 avail Mem
19158 qemu 20 0 0.098t 0.041t 10476 S 3650 35.6 13048:24 qemu-kvm

The compute node has 36 CPUs and the usage is now 100%. There are more than 50 GB of memory still available on the baremetal. The swap is barely used, 120 MB.

On the compute node, from top:

top - 05:11:58 up 1 day, 15:08, 2 users, load average: 40.46, 40.49, 40.74
%Cpu(s): 99.1 us, 0.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.1 si, 0.1 st
KiB Mem : 10296246+total, 78079936 free, 23671360 used, 1211160 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 78939968 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6032 qemu 20 0 10.601g 1.272g 12964 S 400.0 1.3 588:40.39 qemu-kvm
5673 qemu 20 0 10.602g 1.006g 13020 S 399.7 1.0 1161:47 qemu-kvm
5998 qemu 20 0 10.601g 1.192g 13028 S 367.9 1.2 1544:30 qemu-kvm
5951 qemu 20 0 10.601g 1.246g 13020 S 348.3 1.3 1547:38 qemu-kvm
5750 qemu 20 0 10.599g 990136 13060 S 339.1 1.0 1152:25 qemu-kvm
5752 qemu 20 0 10.598g 1.426g 13040 S 313.9 1.5 663:13.65 qemu-kvm
....

There are more than 70 GB of memory available on the compute node. All VMs are using 100% of their CPUs and they are not accessible anymore.
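As a rough cross-check of the figures above, the %CPU column of top can be converted into busy cores (100% corresponds to one core). The following is only an illustrative sketch in Python; the helper name `busy_cores` is hypothetical, the default top column order is assumed, and the rows are copied from the output quoted in this message.

```python
def busy_cores(top_rows):
    """Sum the %CPU column of top process rows and convert it to cores.

    Assumes the default top column order:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    total_pct = 0.0
    for row in top_rows:
        fields = row.split()
        total_pct += float(fields[8])  # %CPU is the 9th column
    return total_pct / 100.0


# Rows copied from the compute node's top output in this thread.
compute_rows = [
    "6032 qemu 20 0 10.601g 1.272g 12964 S 400.0 1.3 588:40.39 qemu-kvm",
    "5673 qemu 20 0 10.602g 1.006g 13020 S 399.7 1.0 1161:47 qemu-kvm",
    "5998 qemu 20 0 10.601g 1.192g 13028 S 367.9 1.2 1544:30 qemu-kvm",
    "5951 qemu 20 0 10.601g 1.246g 13020 S 348.3 1.3 1547:38 qemu-kvm",
    "5750 qemu 20 0 10.599g 990136 13060 S 339.1 1.0 1152:25 qemu-kvm",
    "5752 qemu 20 0 10.598g 1.426g 13040 S 313.9 1.5 663:13.65 qemu-kvm",
]
# Row copied from the baremetal top output (the compute VM's qemu-kvm process).
baremetal_rows = [
    "19158 qemu 20 0 0.098t 0.041t 10476 S 3650 35.6 13048:24 qemu-kvm",
]

print(round(busy_cores(compute_rows), 1))    # 21.7
print(round(busy_cores(baremetal_rows), 1))  # 36.5
```

The six listed guests alone keep about 21.7 cores busy (the "...." means further rows were cut), and the single qemu-kvm process on the baremetal accounts for about 36.5 cores, which is consistent with all 36 CPUs of the compute node being saturated.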
Laurentiu
On Sun, 14 Aug 2016 at 21:44, Boris Derzhavets bderzhavets@hotmail.com wrote:

*From:* centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro
*Sent:* Sunday, August 14, 2016 10:17 AM
*To:* Discussion about the virtualization on CentOS
*Subject:* Re: [CentOS-virt] Nested KVM issue

More details on the subject:
I suppose it is a nested KVM issue because it appeared after I enabled the nested KVM feature. Without it, anyway, the second level VMs are unusable in terms of performance.
I am using CentOS 7 with:
kernel: 3.10.0-327.22.2.el7.x86_64
qemu-kvm: 1.5.3-105.el7_2.4
libvirt: 1.2.17-13.el7_2.5
on both the baremetal and the compute VM.

*Please, post*
1) # virsh dumpxml VM-L1 (where VM-L1 is the L1 guest in which you expect nested KVM to appear)
2) Log in to VM-L1 and run: # lsmod | grep kvm
3) I need these outputs from VM-L1 (in case it is the Compute Node):
# cat /etc/nova/nova.conf | grep virt_type
# cat /etc/nova/nova.conf | grep cpu_mode
Boris.
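The checks requested above could also be scripted. The following is a minimal sketch, not part of the original request: it assumes Python 3 is available, that the KVM `nested` module parameter is exposed under `/sys/module/kvm_intel` or `/sys/module/kvm_amd` (to be read on the baremetal), and that `virt_type` and `cpu_mode` live in the `[libvirt]` section of `nova.conf` on the compute node. The function names are hypothetical.

```python
import configparser
from pathlib import Path

# Assumed sysfs locations of the KVM "nested" module parameter.
NESTED_PARAMS = (
    "/sys/module/kvm_intel/parameters/nested",
    "/sys/module/kvm_amd/parameters/nested",
)


def nested_enabled(param_text):
    """Interpret the content of a 'nested' parameter file ("Y"/"N" or "1"/"0")."""
    return param_text.strip() in ("Y", "y", "1")


def libvirt_settings(conf_text):
    """Return virt_type and cpu_mode from the [libvirt] section of nova.conf text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return {
        key: parser.get("libvirt", key, fallback=None)
        for key in ("virt_type", "cpu_mode")
    }


if __name__ == "__main__":
    # Only reads files that actually exist on the machine it runs on.
    for path in NESTED_PARAMS:
        param = Path(path)
        if param.exists():
            print(path, "->", nested_enabled(param.read_text()))
    nova_conf = Path("/etc/nova/nova.conf")
    if nova_conf.exists():
        print(libvirt_settings(nova_conf.read_text()))
```

A value of `None` for either key simply means the option is not set in `[libvirt]`; the `grep` commands above would still catch it if it is defined elsewhere in the file.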
The only workaround now is to shut down the compute VM and start it again from the baremetal with virsh start. A simple restart of the compute node doesn't help. It looks like the qemu-kvm process corresponding to the compute VM is the problem.
Laurentiu
I would enable ksmtuned logging; if that has already been done, check the logs.
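One way to switch the logging on is sketched below. It assumes that the stock `/etc/ksmtuned.conf` ships with commented-out `LOGFILE=` and `DEBUG=` lines and that ksmtuned is restarted after the file is changed; both are assumptions about this setup, and the function name is hypothetical. The function only transforms configuration text and does not touch any file by itself.

```python
import re


def enable_ksmtuned_debug(conf_text):
    """Uncomment the LOGFILE= and DEBUG= lines of a ksmtuned.conf text.

    All other lines, including other commented-out settings, are left as-is.
    """
    out = []
    for line in conf_text.splitlines():
        match = re.match(r"^\s*#\s*((?:LOGFILE|DEBUG)=.*)$", line)
        out.append(match.group(1) if match else line)
    return "\n".join(out) + "\n"


# Example input modelled on the assumed default configuration file.
sample = "# KSM_THRES_COEF=20\n# LOGFILE=/var/log/ksmtuned\n# DEBUG=1\n"
print(enable_ksmtuned_debug(sample), end="")
```

With the sample input, the two logging lines come out uncommented while `# KSM_THRES_COEF=20` stays untouched.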
________________________________
From: centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro
Sent: Tuesday, August 16, 2016 7:16 AM
To: Discussion about the virtualization on CentOS
Subject: Re: [CentOS-virt] Nested KVM issue

Yes. It is on both baremetal and compute node.
Enabled the logging on both compute and baremetal. Nothing strange in logs:

on baremetal:
Wed Aug 17 11:51:01 EEST 2016: committed 62310764 free 58501808
Wed Aug 17 11:51:01 EEST 2016: 87025667 < 123574516 and free > 24714903, stop ksm

on compute:
Wed Aug 17 08:52:52 UTC 2016: committed 24547132 free 76730936
Wed Aug 17 08:52:52 UTC 2016: 45139624 < 102962460 and free > 20592492, stop ksm

and the compute node is again at 100% CPU utilization.
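The numbers in these log lines can be reproduced with simple arithmetic. The sketch below is an illustration, not ksmtuned's actual code: it assumes a threshold of 20% of total memory (the value the logged figures are consistent with) and models only the "stop ksm" condition that appears in the log. The function name `ksm_stop_check` is hypothetical; the inputs are the kB values from the log lines above.

```python
def ksm_stop_check(total_kb, committed_kb, free_kb, thres_coef=20):
    """Recompute the comparison printed in the ksmtuned debug log.

    Returns (committed + threshold, threshold, stop_ksm), where the
    threshold is thres_coef percent of total memory (integer arithmetic).
    """
    thres = total_kb * thres_coef // 100
    stop_ksm = committed_kb + thres < total_kb and free_kb > thres
    return committed_kb + thres, thres, stop_ksm


# Baremetal: total 123574516, committed 62310764, free 58501808
print(ksm_stop_check(123574516, 62310764, 58501808))
# (87025667, 24714903, True)

# Compute node: total 102962460, committed 24547132, free 76730936
print(ksm_stop_check(102962460, 24547132, 76730936))
# (45139624, 20592492, True)
```

On both machines the committed memory plus the threshold is well below the total and free memory is above the threshold, so ksmtuned stops KSM. That matches the "stop ksm" entries and suggests memory pressure is not what drives the CPU load.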
On Tue, 16 Aug 2016 at 15:26, Boris Derzhavets bderzhavets@hotmail.com wrote:
I would enable ksmtuned logging; if that has already been done, check the logs.
It sounds weird, but try disabling KSM and see whether it helps or not.
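Before and after disabling KSM it may help to confirm its state from sysfs. The sketch below assumes Python 3 and the standard kernel interface under `/sys/kernel/mm/ksm`, where `run` is 0 (stopped, merged pages kept), 1 (running) or 2 (stopped, pages unmerged). On CentOS 7, disabling would normally mean stopping the `ksmtuned` and `ksm` services; that is an assumption about this setup. The helper names are hypothetical.

```python
from pathlib import Path

KSM_DIR = Path("/sys/kernel/mm/ksm")

# Meaning of the values of /sys/kernel/mm/ksm/run.
RUN_STATES = {
    0: "stopped (merged pages kept)",
    1: "running",
    2: "stopped (pages unmerged)",
}


def describe_ksm_run(raw):
    """Translate the raw content of the 'run' file into a description."""
    return RUN_STATES.get(int(raw.strip()), "unknown")


def read_ksm_counters(ksm_dir=KSM_DIR):
    """Read run, pages_shared and pages_sharing if the files exist."""
    counters = {}
    for name in ("run", "pages_shared", "pages_sharing"):
        entry = Path(ksm_dir) / name
        if entry.exists():
            counters[name] = int(entry.read_text().strip())
    return counters


if __name__ == "__main__":
    counters = read_ksm_counters()
    if "run" in counters:
        print("KSM is", describe_ksm_run(str(counters["run"])))
    print(counters)
```

If `pages_sharing` stays above zero after KSM is stopped, merged pages are still in place; writing 2 to `run` asks the kernel to unmerge them.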
________________________________
From: centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro
Sent: Wednesday, August 17, 2016 4:56 AM
To: Discussion about the virtualization on CentOS
Subject: Re: [CentOS-virt] Nested KVM issue

Enabled the logging on both compute and baremetal. Nothing strange in logs:

on baremetal:
Wed Aug 17 11:51:01 EEST 2016: committed 62310764 free 58501808
Wed Aug 17 11:51:01 EEST 2016: 87025667 < 123574516 and free > 24714903, stop ksm

on compute:
Wed Aug 17 08:52:52 UTC 2016: committed 24547132 free 76730936
Wed Aug 17 08:52:52 UTC 2016: 45139624 < 102962460 and free > 20592492, stop ksm

and the compute node is again at 100% CPU utilization.
On both the baremetal and the compute node? Are there any other metrics you consider useful to collect for troubleshooting purposes?

On Wed, 17 Aug 2016 at 13:04, Boris Derzhavets bderzhavets@hotmail.com wrote:
It sounds weird, but try disabling KSM and see whether it helps or not.
For me, KSM is an unpredictable feature. The problem is on Compute; only this node does "copy on write", so disable it only on Compute.
What I want to find out is whether that leads to worse or better guest behavior.
I am not expecting a complete fix. I would track it with top/htop and dmesg from cron at a 1-2 hour interval.
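The periodic tracking suggested above could look roughly like the sketch below. It assumes Python 3.6+ and the `top` and `dmesg` commands on the compute node; the function names, the log path and the cron schedule are hypothetical. Each run appends one timestamped snapshot to a log file.

```python
import subprocess
import sys
from datetime import datetime


def format_snapshot(timestamp, sections):
    """Build one log entry from (title, text) pairs."""
    lines = [f"===== {timestamp} ====="]
    for title, text in sections:
        lines.append(f"--- {title} ---")
        lines.append(text.rstrip())
    return "\n".join(lines) + "\n"


def run(cmd):
    """Run a command and return its combined stdout/stderr as text."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def collect(log_path):
    """Append the head of a batch-mode top and the tail of dmesg to log_path."""
    stamp = datetime.now().isoformat(timespec="seconds")
    top_head = "\n".join(run(["top", "-b", "-n", "1"]).splitlines()[:20])
    dmesg_tail = "\n".join(run(["dmesg"]).splitlines()[-50:])
    entry = format_snapshot(stamp, [("top", top_head), ("dmesg", dmesg_tail)])
    with open(log_path, "a") as log:
        log.write(entry)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Intended to be called from cron with the log file as the only argument.
    collect(sys.argv[1])
```

A crontab entry such as `0 */2 * * * /usr/bin/python3 /usr/local/bin/kvm_watch.py /var/log/kvm_watch.log` (paths are placeholders) would take a snapshot every two hours, so the last entries before the guests hang can be compared with earlier ones.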
________________________________
From: centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro
Sent: Wednesday, August 17, 2016 6:38 AM
To: Discussion about the virtualization on CentOS
Subject: Re: [CentOS-virt] Nested KVM issue

Both baremetal and compute? Are there any other metrics you consider useful to collect for troubleshooting purposes?

On Wed, 17 Aug 2016 at 13:04, Boris Derzhavets bderzhavets@hotmail.com wrote:
It sounds weird, but try disabling KSM and see whether it helps or not.
________________________________ From: centos-virt-bounces@centos.orgmailto:centos-virt-bounces@centos.org <centos-virt-bounces@centos.orgmailto:centos-virt-bounces@centos.org> on behalf of Laurentiu Soica <laurentiu@soica.romailto:laurentiu@soica.ro> Sent: Wednesday, August 17, 2016 4:56 AM
To: Discussion about the virtualization on CentOS Subject: Re: [CentOS-virt] Nested KVM issue Enabled the logging on both compute and baremetal. Nothing strange in logs:
on baremetal : Wed Aug 17 11:51:01 EEST 2016: committed 62310764 free 58501808 Wed Aug 17 11:51:01 EEST 2016: 87025667 < 123574516 and free > 24714903, stop ksm
on compute: Wed Aug 17 08:52:52 UTC 2016: committed 24547132 free 76730936 Wed Aug 17 08:52:52 UTC 2016: 45139624 < 102962460 and free > 20592492, stop ksm
and the compute node is again at 100% CPU utilization.
În mar., 16 aug. 2016 la 15:26, Boris Derzhavets <bderzhavets@hotmail.commailto:bderzhavets@hotmail.com> a scris:
I would enable ksmtuned logging ,if it has been done verify logs
________________________________ From: centos-virt-bounces@centos.orgmailto:centos-virt-bounces@centos.org <centos-virt-bounces@centos.orgmailto:centos-virt-bounces@centos.org> on behalf of Laurentiu Soica <laurentiu@soica.romailto:laurentiu@soica.ro> Sent: Tuesday, August 16, 2016 7:16 AM
To: Discussion about the virtualization on CentOS Subject: Re: [CentOS-virt] Nested KVM issue Yes. It is on both baremetal and compute node.
În mar., 16 aug. 2016 la 13:37, Boris Derzhavets <bderzhavets@hotmail.commailto:bderzhavets@hotmail.com> a scris:
Is KSM enabled on your Compute Nodes ( presuming CentOS 7.2 on bare metal ) ?
________________________________ From: centos-virt-bounces@centos.orgmailto:centos-virt-bounces@centos.org <centos-virt-bounces@centos.orgmailto:centos-virt-bounces@centos.org> on behalf of Laurentiu Soica <laurentiu@soica.romailto:laurentiu@soica.ro> Sent: Tuesday, August 16, 2016 5:25 AM
To: Discussion about the virtualization on CentOS Subject: Re: [CentOS-virt] Nested KVM issue Running the compute node for several days simply triggers it.
În mar., 16 aug. 2016 la 12:12, Boris Derzhavets <bderzhavets@hotmail.commailto:bderzhavets@hotmail.com> a scris:
Sorry,
How you trigger the problem ?
B.
________________________________ From: centos-virt-bounces@centos.orgmailto:centos-virt-bounces@centos.org <centos-virt-bounces@centos.orgmailto:centos-virt-bounces@centos.org> on behalf of Laurentiu Soica <laurentiu@soica.romailto:laurentiu@soica.ro> Sent: Tuesday, August 16, 2016 3:28 AM
To: Discussion about the virtualization on CentOS Subject: Re: [CentOS-virt] Nested KVM issue Hello,
The issue reproduced again and it doesn't look like a swap problem. Some details:
on the baremetal, from top:
top - 08:08:52 up 5 days, 16:43, 3 users, load average: 36.19, 36.05, 36.05 Tasks: 493 total, 1 running, 492 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.5 us, 87.9 sy, 0.0 ni, 8.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st KiB Mem : 12357451+total, 14296000 free, 65634428 used, 43644088 buff/cache KiB Swap: 4194300 total, 4073868 free, 120432 used. 56953888 avail Mem
19158 qemu 20 0 0.098t 0.041t 10476 S 3650 35.6 13048:24 qemu-kvm
The compute node has 36 CPUs and the usage is now 100%. There are more than 50 GB of memory still available on the baremetal. The swap is barely used, 120 MB.
On compute node, from top:
top - 05:11:58 up 1 day, 15:08, 2 users, load average: 40.46, 40.49, 40.74
%Cpu(s): 99.1 us, 0.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.1 si, 0.1 st KiB Mem : 10296246+total, 78079936 free, 23671360 used, 1211160 buff/cache KiB Swap: 0 total, 0 free, 0 used. 78939968 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 6032 qemu 20 0 10.601g 1.272g 12964 S 400.0 1.3 588:40.39 qemu-kvm 5673 qemu 20 0 10.602g 1.006g 13020 S 399.7 1.0 1161:47 qemu-kvm 5998 qemu 20 0 10.601g 1.192g 13028 S 367.9 1.2 1544:30 qemu-kvm 5951 qemu 20 0 10.601g 1.246g 13020 S 348.3 1.3 1547:38 qemu-kvm 5750 qemu 20 0 10.599g 990136 13060 S 339.1 1.0 1152:25 qemu-kvm 5752 qemu 20 0 10.598g 1.426g 13040 S 313.9 1.5 663:13.65 qemu-kvm ....
There are more than 70 GB of memory available on the compute node. All VMs are using 100% of their CPUs and are no longer accessible.
Laurentiu
On Sun, 14 Aug 2016 at 21:44, Boris Derzhavets <bderzhavets@hotmail.com> wrote:
________________________________
From: centos-virt-bounces@centos.org <centos-virt-bounces@centos.org> on behalf of Laurentiu Soica <laurentiu@soica.ro>
Sent: Sunday, August 14, 2016 10:17 AM
To: Discussion about the virtualization on CentOS
Subject: Re: [CentOS-virt] Nested KVM issue
More details on the subject:
I suppose it is a nested KVM issue because it appeared after I enabled the nested KVM feature. Without nested KVM, however, the second-level VMs are too slow to be usable.
I am using CentOS 7 with:
kernel: 3.10.0-327.22.2.el7.x86_64
qemu-kvm: 1.5.3-105.el7_2.4
libvirt: 1.2.17-13.el7_2.5
on both the baremetal and the compute VM.
Please post:
1) # virsh dumpxml VM-L1 (VM-L1 being the L1 guest where you expect nested KVM to appear)
2) Log in to VM-L1 and run:
   # lsmod | grep kvm
3) I need these outputs from VM-L1 (in case it is the Compute Node):
   # cat /etc/nova/nova.conf | grep virt_type
   # cat /etc/nova/nova.conf | grep cpu_mode
Boris.
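The checks requested here can be gathered in one pass. What follows is only a minimal sketch and was not posted in the thread: the function names are hypothetical, and the default sysfs path assumes an Intel host using the kvm_intel module (on AMD the module would be kvm_amd).

```shell
#!/bin/sh
# Hypothetical helpers; names and default paths are assumptions for illustration.

# Print "enabled", "disabled" or "unknown" for the nested parameter file
# given as $1 (default: the kvm_intel module parameter).
nested_status() {
    param_file="${1:-/sys/module/kvm_intel/parameters/nested}"
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 0
    fi
    case "$(cat "$param_file")" in
        Y|y|1) echo "enabled" ;;
        *)     echo "disabled" ;;
    esac
}

# Print the active virt_type and cpu_mode lines from a nova.conf-style
# file ($1), skipping commented-out entries.
nova_virt_settings() {
    conf="${1:-/etc/nova/nova.conf}"
    grep -E '^[[:space:]]*(virt_type|cpu_mode)[[:space:]]*=' "$conf"
}
```

On the baremetal, `nested_status` indicates whether L1 guests can use KVM at all; inside VM-L1, `lsmod | grep kvm` together with `nova_virt_settings` covers items 2 and 3.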
The only workaround for now is to shut down the compute VM and start it again from the baremetal with virsh start. A simple restart of the compute node doesn't help. It looks like the qemu-kvm process corresponding to the compute VM is the problem.
Laurentiu
On Sun, 14 Aug 2016 at 00:19, Laurentiu Soica <laurentiu@soica.ro> wrote:
Hello,
I have an OpenStack setup in virtual environment on CentOS 7.
The baremetal has nested KVM enabled and 1 compute node as a VM.
Inside the compute node I have multiple VMs running.
About every 3 days the VMs become inaccessible and the compute node reports high CPU usage. The qemu-kvm process for each VM inside the compute node reports full CPU usage.
Please help me with some hints to debug this issue.
Thanks,
Laurentiu
_______________________________________________
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt
I've tried with KSM disabled and nothing changed.
I've upgraded KVM to qemu-kvm-ev. I'm waiting to see whether there is any improvement and will report back.
On Wed, 17 Aug 2016 at 15:10, Boris Derzhavets bderzhavets@hotmail.com wrote:
For me, KSM is an unpredictable feature. The problem is on Compute; only this node does "copy on write", so only on Compute.
My concern is exactly whether it would lead to worse or better guest behavior.
I am not expecting a complete fix. I would track top/htop and dmesg output via cron every 1-2 hours.
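The periodic tracking suggested here could be sketched roughly as follows. This is an illustration only: the function name, the log location and the script path in the crontab comment are all assumptions, not something from the thread.

```shell
#!/bin/sh
# Hypothetical monitoring helper; the function name and log path are assumptions.

# Append a timestamped snapshot of top and dmesg to the log file given as $1.
snapshot() {
    log_file="${1:-/var/log/nested-kvm-snapshot.log}"
    {
        echo "===== $(date '+%Y-%m-%d %H:%M:%S') ====="
        # Batch mode, one iteration: keeps the summary and the busiest processes.
        top -b -n 1 2>/dev/null | head -n 20
        echo "----- dmesg (last 50 lines) -----"
        dmesg 2>/dev/null | tail -n 50
    } >> "$log_file"
}

# Example crontab entry (every 2 hours), assuming this file is saved as
# /usr/local/bin/kvm-snapshot.sh and ends with a call to `snapshot`:
#   0 */2 * * * /usr/local/bin/kvm-snapshot.sh
```

Running the same script on both the baremetal and the compute node would make it possible to compare the two logs around the time the CPU usage jumps.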
*From:* centos-virt-bounces@centos.org centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro
*Sent:* Wednesday, August 17, 2016 6:38 AM
*To:* Discussion about the virtualization on CentOS
*Subject:* Re: [CentOS-virt] Nested KVM issue
On both baremetal and compute? Are there any other metrics you consider useful to collect for troubleshooting?
On Wed, 17 Aug 2016 at 13:04, Boris Derzhavets bderzhavets@hotmail.com wrote:
It sounds weird, but try disabling KSM and see whether it helps.
*From:* centos-virt-bounces@centos.org centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro
*Sent:* Wednesday, August 17, 2016 4:56 AM
*To:* Discussion about the virtualization on CentOS
*Subject:* Re: [CentOS-virt] Nested KVM issue
I enabled the logging on both compute and baremetal. Nothing strange in the logs:
on baremetal:
Wed Aug 17 11:51:01 EEST 2016: committed 62310764 free 58501808
Wed Aug 17 11:51:01 EEST 2016: 87025667 < 123574516 and free > 24714903, stop ksm
on compute:
Wed Aug 17 08:52:52 UTC 2016: committed 24547132 free 76730936
Wed Aug 17 08:52:52 UTC 2016: 45139624 < 102962460 and free > 20592492, stop ksm
and the compute node is again at 100% CPU utilization.
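The numbers in these ksmtuned log lines can be cross-checked by hand. The sketch below is a reconstruction from the logged values, not ksmtuned's actual code: it assumes the threshold is 20% of total memory (which reproduces the logged 24714903 and 20592492 exactly) and that KSM is stopped when committed + threshold is below total and free memory is above the threshold. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical re-computation of the decision printed in the ksmtuned log.
# Arguments (all in KiB): total memory, committed memory, free memory.
# Assumption: threshold = 20% of total; this matches the logged values.
ksm_decision() {
    total=$1; committed=$2; free=$3
    thres=$(( total * 20 / 100 ))
    if [ $(( committed + thres )) -lt "$total" ] && [ "$free" -gt "$thres" ]; then
        echo "$(( committed + thres )) < $total and free > $thres, stop ksm"
    else
        # Wording of this branch is invented for the sketch.
        echo "condition not met, KSM would keep running"
    fi
}

# Values taken from the baremetal and compute logs quoted in the thread.
ksm_decision 123574516 62310764 58501808
ksm_decision 102962460 24547132 76730936
```

Both calls reproduce the "stop ksm" lines from the logs, which is consistent with the observation that neither host is under memory pressure when the CPU usage spikes.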
On Tue, 16 Aug 2016 at 15:26, Boris Derzhavets bderzhavets@hotmail.com wrote:
I would enable ksmtuned logging; if that has already been done, check the logs.
*From:* centos-virt-bounces@centos.org centos-virt-bounces@centos.org on behalf of Laurentiu Soica laurentiu@soica.ro
*Sent:* Tuesday, August 16, 2016 7:16 AM
*To:* Discussion about the virtualization on CentOS
*Subject:* Re: [CentOS-virt] Nested KVM issue
Yes. It is enabled on both the baremetal and the compute node.
On Tue, 16 Aug 2016 at 13:37, Boris Derzhavets <bderzhavets@hotmail.com> wrote:
Is KSM enabled on your Compute Nodes (presuming CentOS 7.2 on bare metal)?
No luck with qemu-kvm-ev; the behavior is the same. Running perf record -a -g on the baremetal shows that most of the CPU time is spent in _raw_spin_lock:
  Children      Self  Command   Shared Object      Symbol
-   93.62%    93.62%  qemu-kvm  [kernel.kallsyms]  [k] _raw_spin_lock
   - _raw_spin_lock
      + 45.30% kvm_mmu_sync_roots
      + 28.49% kvm_mmu_load
      + 25.00% mmu_free_roots
      + 1.12% tdp_page_fault
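To make the breakdown concrete: nearly all of the _raw_spin_lock time is reached through three KVM MMU root-handling functions, which may point at lock contention in the MMU paths rather than at memory pressure. The following sketch simply sums their shares from the caller lines in the perf output; the helper name is hypothetical and the input is copied from the report in this message.

```shell
#!/bin/sh
# Hypothetical helper: sum the caller percentages of the MMU root functions
# from perf-report-style lines read on stdin.
mmu_root_share() {
    awk '
        /kvm_mmu_sync_roots|kvm_mmu_load|mmu_free_roots/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9.]+%$/) { sub(/%/, "", $i); total += $i }
        }
        END { printf "%.2f\n", total }
    '
}

# Caller lines from the perf report quoted in the thread.
mmu_root_share <<'EOF'
+ 45.30% kvm_mmu_sync_roots
+ 28.49% kvm_mmu_load
+ 25.00% mmu_free_roots
+ 1.12% tdp_page_fault
EOF
```

The three functions together account for 98.79% of the callers of _raw_spin_lock, with tdp_page_fault making up almost all of the remainder.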
--
Laurentiu Soica
Laurentiu,
Just to chip in, without having thoroughly read the thread (for which I apologize in advance): have you tried testing another kernel instead of the stock one? You can have a look at http://elrepo.org/tiki/kernel-ml and http://elrepo.org/tiki/kernel-lt and use them for some comparison testing.
Cheers,
---
Adi Pircalabu
I've tried kernel-lt, with the same results. With kernel-ml, the baremetal wasn't able to boot at the time.