[CentOS-virt] Nested KVM issue

Thu Aug 18 08:59:28 UTC 2016
Laurentiu Soica <laurentiu at soica.ro>

I've tried with KSM disabled and nothing changed.

I've upgraded KVM to qemu-kvm-ev. I'm waiting to see if there are any
improvements and will report back.
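
For reference, the ksmtuned log lines quoted further down in this thread can
be checked by hand. Below is a minimal sketch, assuming the threshold is 20%
of total memory (a value inferred from the logged numbers, not read from any
configuration file). The helper name ksm_decision is hypothetical and is not
part of ksmtuned.

```shell
#!/bin/sh
# Sketch only: reproduces the arithmetic visible in the quoted ksmtuned logs.
# ksm_decision is a hypothetical helper, not a ksmtuned function.
# Arguments: total, committed and free memory in KiB, then an optional
# threshold percentage (assumed default: 20, which matches the logged values).
ksm_decision() {
    total=$1 committed=$2 free=$3 coef=${4:-20}
    # Threshold is a fixed share of total memory (integer division).
    thres=$(( total * coef / 100 ))
    sum=$(( committed + thres ))
    if [ "$sum" -lt "$total" ] && [ "$free" -gt "$thres" ]; then
        # Same wording as the log lines in this thread.
        echo "$sum < $total and free > $thres, stop ksm"
    else
        # Wording of this branch is illustrative, not taken from ksmtuned.
        echo "$sum >= $total or free <= $thres, ksm may be started"
    fi
}

# Baremetal: total is the right-hand side of the "<" in the baremetal log.
ksm_decision 123574516 62310764 58501808
# Compute node: same reading of the compute log.
ksm_decision 102962460 24547132 76730936
```

Both calls print the same "stop ksm" lines that appear in the logs. This
suggests that ksmtuned saw enough free memory on both hosts and was not
asking KSM to run when the logs were written.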

On Wed, 17 Aug 2016 at 15:10, Boris Derzhavets <bderzhavets at hotmail.com>
wrote:

> For me, KSM is an unpredictable feature. The problem is on Compute; only
> this node does "copy on write", so only on Compute.
>
> My exact concern is whether it would lead to worse or better guest
> behavior.
>
> I am not expecting a complete fix. I would track it via top/htop and dmesg,
> run from cron every 1-2 hours.
>
>
> ------------------------------
> *From:* centos-virt-bounces at centos.org <centos-virt-bounces at centos.org>
> on behalf of Laurentiu Soica <laurentiu at soica.ro>
> *Sent:* Wednesday, August 17, 2016 6:38 AM
>
> *To:* Discussion about the virtualization on CentOS
> *Subject:* Re: [CentOS-virt] Nested KVM issue
> Both baremetal and compute? Are there any other metrics you consider
> useful to collect for troubleshooting purposes?
>
> On Wed, 17 Aug 2016 at 13:04, Boris Derzhavets <bderzhavets at hotmail.com>
> wrote:
>
>> It sounds weird, but try disabling KSM and see whether it helps.
>>
>>
>> ------------------------------
>> *From:* centos-virt-bounces at centos.org <centos-virt-bounces at centos.org>
>> on behalf of Laurentiu Soica <laurentiu at soica.ro>
>> *Sent:* Wednesday, August 17, 2016 4:56 AM
>>
>> *To:* Discussion about the virtualization on CentOS
>> *Subject:* Re: [CentOS-virt] Nested KVM issue
>> I enabled the logging on both compute and baremetal. Nothing strange in
>> the logs:
>>
>> on baremetal:
>> Wed Aug 17 11:51:01 EEST 2016: committed 62310764 free 58501808
>> Wed Aug 17 11:51:01 EEST 2016: 87025667 < 123574516 and free > 24714903,
>> stop ksm
>>
>> on compute:
>> Wed Aug 17 08:52:52 UTC 2016: committed 24547132 free 76730936
>> Wed Aug 17 08:52:52 UTC 2016: 45139624 < 102962460 and free > 20592492,
>> stop ksm
>>
>> and the compute node is again at 100% CPU utilization.
>>
>>
>>
>> On Tue, 16 Aug 2016 at 15:26, Boris Derzhavets <bderzhavets at hotmail.com>
>> wrote:
>>
>>> I would enable ksmtuned logging; if that has already been done, check
>>> the logs.
>>>
>>>
>>> ------------------------------
>>> *From:* centos-virt-bounces at centos.org <centos-virt-bounces at centos.org>
>>> on behalf of Laurentiu Soica <laurentiu at soica.ro>
>>> *Sent:* Tuesday, August 16, 2016 7:16 AM
>>>
>>> *To:* Discussion about the virtualization on CentOS
>>> *Subject:* Re: [CentOS-virt] Nested KVM issue
>>> Yes. It is on both baremetal and compute node.
>>>
>>> On Tue, 16 Aug 2016 at 13:37, Boris Derzhavets <
>>> bderzhavets at hotmail.com> wrote:
>>>
>>>> Is KSM enabled on your Compute Nodes (presuming CentOS 7.2 on bare
>>>> metal)?
>>>>
>>>>
>>>> ------------------------------
>>>> *From:* centos-virt-bounces at centos.org <centos-virt-bounces at centos.org>
>>>> on behalf of Laurentiu Soica <laurentiu at soica.ro>
>>>> *Sent:* Tuesday, August 16, 2016 5:25 AM
>>>>
>>>> *To:* Discussion about the virtualization on CentOS
>>>> *Subject:* Re: [CentOS-virt] Nested KVM issue
>>>> Running the compute node for several days simply triggers it.
>>>>
>>>> On Tue, 16 Aug 2016 at 12:12, Boris Derzhavets <
>>>> bderzhavets at hotmail.com> wrote:
>>>>
>>>>> Sorry,
>>>>>
>>>>> How do you trigger the problem?
>>>>>
>>>>> B.
>>>>>
>>>>>
>>>>> ------------------------------
>>>>> *From:* centos-virt-bounces at centos.org <centos-virt-bounces at centos.org>
>>>>> on behalf of Laurentiu Soica <laurentiu at soica.ro>
>>>>> *Sent:* Tuesday, August 16, 2016 3:28 AM
>>>>>
>>>>> *To:* Discussion about the virtualization on CentOS
>>>>> *Subject:* Re: [CentOS-virt] Nested KVM issue
>>>>> Hello,
>>>>>
>>>>> The issue reproduced again and it doesn't look like a swap problem.
>>>>> Some details:
>>>>>
>>>>> on the baremetal, from top:
>>>>>
>>>>> top - 08:08:52 up 5 days, 16:43,  3 users,  load average: 36.19, 36.05, 36.05
>>>>> Tasks: 493 total,   1 running, 492 sleeping,   0 stopped,   0 zombie
>>>>> %Cpu(s):  3.5 us, 87.9 sy,  0.0 ni,  8.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
>>>>> KiB Mem : 12357451+total, 14296000 free, 65634428 used, 43644088 buff/cache
>>>>> KiB Swap:  4194300 total,  4073868 free,   120432 used. 56953888 avail Mem
>>>>>
>>>>> 19158 qemu      20   0  0.098t 0.041t  10476 S  3650 35.6  13048:24 qemu-kvm
>>>>>
>>>>> The compute node has 36 CPUs and the usage is now 100%. There are more
>>>>> than 50 GB of memory still available on the baremetal. The swap is barely
>>>>> used, 120 MB.
>>>>>
>>>>> On compute node, from top:
>>>>>
>>>>> top - 05:11:58 up 1 day, 15:08,  2 users,  load average: 40.46, 40.49, 40.74
>>>>> %Cpu(s): 99.1 us,  0.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.1 si,  0.1 st
>>>>> KiB Mem : 10296246+total, 78079936 free, 23671360 used,  1211160 buff/cache
>>>>> KiB Swap:        0 total,        0 free,        0 used. 78939968 avail Mem
>>>>>
>>>>>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>>>>>  6032 qemu      20   0 10.601g 1.272g  12964 S 400.0  1.3 588:40.39 qemu-kvm
>>>>>  5673 qemu      20   0 10.602g 1.006g  13020 S 399.7  1.0   1161:47 qemu-kvm
>>>>>  5998 qemu      20   0 10.601g 1.192g  13028 S 367.9  1.2   1544:30 qemu-kvm
>>>>>  5951 qemu      20   0 10.601g 1.246g  13020 S 348.3  1.3   1547:38 qemu-kvm
>>>>>  5750 qemu      20   0 10.599g 990136  13060 S 339.1  1.0   1152:25 qemu-kvm
>>>>>  5752 qemu      20   0 10.598g 1.426g  13040 S 313.9  1.5 663:13.65 qemu-kvm
>>>>> ....
>>>>>
>>>>> There are more than 70 GB of memory available on the compute node. All
>>>>> VMs are using 100% of their CPUs and they are no longer accessible.
>>>>>
>>>>> Laurentiu
>>>>>
>>>>> On Sun, 14 Aug 2016 at 21:44, Boris Derzhavets <
>>>>> bderzhavets at hotmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>> *From:* centos-virt-bounces at centos.org <
>>>>>> centos-virt-bounces at centos.org> on behalf of Laurentiu Soica <
>>>>>> laurentiu at soica.ro>
>>>>>> *Sent:* Sunday, August 14, 2016 10:17 AM
>>>>>> *To:* Discussion about the virtualization on CentOS
>>>>>> *Subject:* Re: [CentOS-virt] Nested KVM issue
>>>>>>
>>>>>> More details on the subject:
>>>>>>
>>>>>> I suppose it is a nested KVM issue because it appeared after I enabled
>>>>>> the nested KVM feature. Without it, however, the second-level VMs are
>>>>>> unusable in terms of performance.
>>>>>>
>>>>>> I am using CentOS 7 with:
>>>>>>
>>>>>> kernel: 3.10.0-327.22.2.el7.x86_64
>>>>>> qemu-kvm: 1.5.3-105.el7_2.4
>>>>>> libvirt: 1.2.17-13.el7_2.5
>>>>>>
>>>>>> on both the baremetal and the compute VM.
>>>>>>
>>>>>> *Please post:*
>>>>>>
>>>>>> 1) # virsh dumpxml VM-L1 (where L1 is the level at which you expect
>>>>>> nested KVM to appear)
>>>>>> 2) Log in to VM-L1 and run:
>>>>>>     # lsmod | grep kvm
>>>>>> 3) I need these outputs from VM-L1 (in case it is the Compute Node):
>>>>>>
>>>>>> # cat /etc/nova/nova.conf | grep virt_type
>>>>>> # cat /etc/nova/nova.conf | grep  cpu_mode
>>>>>>
>>>>>> Boris.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The only workaround for now is to shut down the compute VM and start
>>>>>> it again from baremetal with virsh start.
>>>>>> A simple restart of the compute node doesn't help. It looks like the
>>>>>> qemu-kvm process corresponding to the compute VM is the problem.
>>>>>>
>>>>>> Laurentiu
>>>>>>
>>>>>> On Sun, 14 Aug 2016 at 00:19, Laurentiu Soica <laurentiu at soica.ro>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I have an OpenStack setup in a virtual environment on CentOS 7.
>>>>>>>
>>>>>>> The baremetal has *nested KVM* enabled and 1 compute node as a VM.
>>>>>>>
>>>>>>> Inside the compute node I have multiple VMs running.
>>>>>>>
>>>>>>> About every 3 days the VMs become inaccessible and the compute
>>>>>>> node reports high CPU usage. The qemu-kvm process for each VM inside the
>>>>>>> compute node reports full CPU usage.
>>>>>>>
>>>>>>> Please help me with some hints to debug this issue.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Laurentiu
>>>>>>>
>>>>>> _______________________________________________
>>>>>> CentOS-virt mailing list
>>>>>> CentOS-virt at centos.org
>>>>>> https://lists.centos.org/mailman/listinfo/centos-virt
>>>>>>