[CentOS-virt] Nested KVM issue

Mon Aug 15 06:15:09 UTC 2016
Laurentiu Soica <laurentiu at soica.ro>

The CPUs are IvyBridge microarchitecture, Xeon E5-2670 v2.

I could try lowering the vCPU counts, but I doubt it would help. Please note
that the same VMs run just fine (with a load of 2, against an acceptable
maximum of 36, on the compute node) for about 3 days after a restart.

On Mon, Aug 15, 2016 at 08:42, Boris Derzhavets <bderzhavets at hotmail.com>
wrote:

> I would try decreasing the number of vCPUs allocated to the cloud VMs.
>
> Say, try 4 => 2. My guess is there are not enough vCPUs to run the OS itself.
>
> I also guess the CPU model is older than Haswell. Please confirm (or not) if possible.
>
> In my experience, since Haswell was launched, Intel Xeons based on that
> microarchitecture (or later ones) have behaved much better than SandyBridge-
> or IvyBridge-based parts.
>
> Boris.
>
> ------------------------------
> *From:* centos-virt-bounces at centos.org <centos-virt-bounces at centos.org>
> on behalf of Laurentiu Soica <laurentiu at soica.ro>
> *Sent:* Monday, August 15, 2016 1:15 AM
>
> *To:* Discussion about the virtualization on CentOS
> *Subject:* Re: [CentOS-virt] Nested KVM issue
> Hello Boris,
>
> 1. So, in about three days after a reboot (this happened several times
> already) the compute node reports high CPU usage. It has 36 vCPUs and it
> reports a load higher than 40. Usually the load is about 2 or 3.
> The VMs' qemu-kvm processes report 100% CPU usage per vCPU (for a VM with 4
> vCPUs it reports almost 400%, for one with 1 vCPU almost 100%). The VMs are
> no longer accessible through SSH.
>
> 2. The baremetal has 2 CPUs, each with 10 cores and HT activated so it
> reports 40 CPUs.
> It has 128 GB RAM out of which 100 GB are for the compute node.
>
> I have 15 VMs running inside the compute node. Together they sum to 40 vCPUs
> and 92 GB of RAM.
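> For reference, the oversubscription implied by the figures in this thread can
> be tallied quickly (a sketch using only the numbers quoted above, not live
> data from the host):

```shell
# Figures from this thread: 2 sockets x 10 cores (HT => 40 threads),
# a 36-vCPU compute VM, and 15 nested VMs totalling 40 vCPUs / 92 GB RAM
# inside a 100 GB compute VM.
sockets=2; cores_per_socket=10
phys_cores=$((sockets * cores_per_socket))                       # 20 real cores
echo "physical cores: $phys_cores"
echo "L1 vCPUs per physical core: $(awk "BEGIN {printf \"%.1f\", 36 / $phys_cores}")"
echo "L2 vCPUs per L1 vCPU: $(awk 'BEGIN {printf "%.1f", 40 / 36}')"
echo "RAM headroom in compute VM: $((100 - 92)) GB"
```

> So the nested guests themselves are barely overcommitted (about 1.1x); the
> bigger ratio is the compute VM's 36 vCPUs sitting on 20 physical cores.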
>
> There are no swap devices installed on the compute node so the
> reported SwapTotal is 0 KB.
>
> I'll check whether the memory on the compute node gets exhausted as soon as
> the problem reproduces again (in about 2 days), but for now more than 80 GB
> are available.
>
> Note that a reboot of the compute node doesn't fix the problem. Only shutting
> the compute VM down and starting it again with virsh start works.
>
> Thanks,
> Laurentiu
>
> On Sun, Aug 14, 2016 at 23:27, Boris Derzhavets <bderzhavets at hotmail.com>
> wrote:
>
>> The reports posted look good to me. The config should provide the best
>> available performance for the cloud VM (L2) on the compute node.
>>
>> 1. Please remind me what goes wrong from your standpoint?
>>
>> 2. Which CPU is installed on the compute node, and how much RAM?
>>
>>     Actually, my concern is:
>>
>>     Number_of_Cloud_VMs versus Number_of_CPU_Cores (not threads)
>>
>>     Please check the `top` report with regard to swap area size.
>>
>> Thanks.
>>
>> Boris.
>> ------------------------------
>> *From:* centos-virt-bounces at centos.org <centos-virt-bounces at centos.org>
>> on behalf of Laurentiu Soica <laurentiu at soica.ro>
>> *Sent:* Sunday, August 14, 2016 3:06 PM
>>
>> *To:* Discussion about the virtualization on CentOS
>> *Subject:* Re: [CentOS-virt] Nested KVM issue
>> Hello,
>>
>> 1. <domain type='kvm' id='6'>
>>   <name>baremetalbrbm_1</name>
>>   <uuid>534e9b54-5e4c-4acb-adcf-793f841551a7</uuid>
>>   <memory unit='KiB'>104857600</memory>
>>   <currentMemory unit='KiB'>104857600</currentMemory>
>>   <vcpu placement='static'>36</vcpu>
>>   <resource>
>>     <partition>/machine</partition>
>>   </resource>
>>   <os>
>>     <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
>>     <boot dev='hd'/>
>>     <bootmenu enable='no'/>
>>   </os>
>>   <features>
>>     <acpi/>
>>     <apic/>
>>     <pae/>
>>   </features>
>>   <cpu mode='host-passthrough'/>
>>   <clock offset='utc'/>
>>   <on_poweroff>destroy</on_poweroff>
>>   <on_reboot>restart</on_reboot>
>>   <on_crash>restart</on_crash>
>>   <devices>
>>     <emulator>/usr/libexec/qemu-kvm</emulator>
>>     <disk type='file' device='disk'>
>>       <driver name='qemu' type='qcow2' cache='unsafe'/>
>>       <source file='/var/lib/libvirt/images/baremetalbrbm_1.qcow2'/>
>>       <backingStore/>
>>       <target dev='sda' bus='sata'/>
>>       <alias name='sata0-0-0'/>
>>       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>>     </disk>
>>     <controller type='scsi' index='0' model='virtio-scsi'>
>>       <alias name='scsi0'/>
>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
>>     </controller>
>>     <controller type='usb' index='0'>
>>       <alias name='usb'/>
>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
>>     </controller>
>>     <controller type='pci' index='0' model='pci-root'>
>>       <alias name='pci.0'/>
>>     </controller>
>>     <controller type='sata' index='0'>
>>       <alias name='sata0'/>
>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
>>     </controller>
>>     <interface type='bridge'>
>>       <mac address='00:f1:15:20:c5:46'/>
>>       <source network='brbm' bridge='brbm'/>
>>       <virtualport type='openvswitch'>
>>         <parameters interfaceid='654ad04f-fa0a-41dd-9d30-b84e702462fe'/>
>>       </virtualport>
>>       <target dev='vnet5'/>
>>       <model type='virtio'/>
>>       <alias name='net0'/>
>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
>>     </interface>
>>     <interface type='bridge'>
>>       <mac address='52:54:00:d3:c9:24'/>
>>       <source bridge='br57'/>
>>       <target dev='vnet6'/>
>>       <model type='rtl8139'/>
>>       <alias name='net1'/>
>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
>>     </interface>
>>     <serial type='pty'>
>>       <source path='/dev/pts/3'/>
>>       <target port='0'/>
>>       <alias name='serial0'/>
>>     </serial>
>>     <console type='pty' tty='/dev/pts/3'>
>>       <source path='/dev/pts/3'/>
>>       <target type='serial' port='0'/>
>>       <alias name='serial0'/>
>>     </console>
>>     <input type='mouse' bus='ps2'/>
>>     <input type='keyboard' bus='ps2'/>
>>     <graphics type='vnc' port='5903' autoport='yes' listen='127.0.0.1'>
>>       <listen type='address' address='127.0.0.1'/>
>>     </graphics>
>>     <video>
>>       <model type='cirrus' vram='16384' heads='1'/>
>>       <alias name='video0'/>
>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
>>     </video>
>>     <memballoon model='virtio'>
>>       <alias name='balloon0'/>
>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
>>     </memballoon>
>>   </devices>
>> </domain>
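>> (Side note: the <memory> value in the dump above is in KiB; a quick check
>> that it matches the 100 GB mentioned for the compute node:)

```shell
# 104857600 KiB -> GiB  (divide by 1024 twice)
echo $((104857600 / 1024 / 1024))   # prints 100
```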
>>
>> 2.
>> [root at overcloud-novacompute-0 ~]# lsmod | grep kvm
>> kvm_intel             162153  70
>> kvm                   525409  1 kvm_intel
>>
>> [root at overcloud-novacompute-0 ~]# cat /etc/nova/nova.conf | grep
>> virt_type|grep -v '^#'
>> virt_type=kvm
>>
>> [root at overcloud-novacompute-0 ~]#  cat /etc/nova/nova.conf | grep
>>  cpu_mode|grep -v '^#'
>> cpu_mode=host-passthrough
>>
>> Thanks,
>> Laurentiu
>>
>> On Sun, Aug 14, 2016 at 21:44, Boris Derzhavets <bderzhavets at hotmail.com>
>> wrote:
>>
>>>
>>>
>>>
>>> ------------------------------
>>> *From:* centos-virt-bounces at centos.org <centos-virt-bounces at centos.org>
>>> on behalf of Laurentiu Soica <laurentiu at soica.ro>
>>> *Sent:* Sunday, August 14, 2016 10:17 AM
>>> *To:* Discussion about the virtualization on CentOS
>>> *Subject:* Re: [CentOS-virt] Nested KVM issue
>>>
>>> More details on the subject:
>>>
>>> I suppose it is a nested KVM issue because it appeared after I enabled the
>>> nested KVM feature. Without it, in any case, the second-level VMs are
>>> unusable in terms of performance.
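>>> (For context: on an Intel host, nested KVM is typically enabled with a
>>> modprobe option like the one below. This is a generic sketch; the thread
>>> doesn't show the actual host configuration. The kvm_intel module has to be
>>> reloaded, or the host rebooted, for the change to take effect.)

```
# /etc/modprobe.d/kvm_intel.conf -- typical way to enable nested KVM on Intel
# verify afterwards with: cat /sys/module/kvm_intel/parameters/nested
options kvm_intel nested=1
```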
>>>
>>> I am using CentOS 7 with:
>>>
>>> kernel: 3.10.0-327.22.2.el7.x86_64
>>> qemu-kvm:1.5.3-105.el7_2.4
>>> libvirt:1.2.17-13.el7_2.5
>>>
>>> on both the baremetal and the compute VM.
>>>
>>> *Please, post:*
>>>
>>> 1) # virsh dumpxml VM-L1  (where VM-L1 is the L1 guest in which you expect
>>> nested KVM to be used)
>>> 2) Log in to VM-L1 and run:
>>>     # lsmod | grep kvm
>>> 3) I also need these outputs from VM-L1 (in case it is the compute node):
>>>
>>> # cat /etc/nova/nova.conf | grep virt_type
>>> # cat /etc/nova/nova.conf | grep cpu_mode
>>>
>>> Boris.
>>>
>>>
>>>
>>>
>>>
>>> The only workaround for now is to shut down the compute VM and start it
>>> back from the baremetal with virsh start.
>>> A simple restart of the compute node doesn't help. It looks like the
>>> qemu-kvm process corresponding to the compute VM is the problem.
>>>
>>> Laurentiu
>>>
>>> On Sun, Aug 14, 2016 at 00:19, Laurentiu Soica <laurentiu at soica.ro>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have an OpenStack setup in virtual environment on CentOS 7.
>>>>
>>>> The baremetal has *nested KVM* enabled and 1 compute node as a VM.
>>>>
>>>> Inside the compute node I have multiple VMs running.
>>>>
>>>> About every 3 days the VMs become inaccessible and the compute node
>>>> reports high CPU usage. The qemu-kvm process for each VM inside the
>>>> compute node reports full CPU usage.
>>>>
>>>> Please help me with some hints to debug this issue.
>>>>
>>>> Thanks,
>>>> Laurentiu
>>>>
>>> _______________________________________________
>>> CentOS-virt mailing list
>>> CentOS-virt at centos.org
>>> https://lists.centos.org/mailman/listinfo/centos-virt
>>>