[CentOS-virt] kernel-4.9.37-29.el7 (and el6)

Mon Jul 24 21:04:13 UTC 2017
Johnny Hughes <johnny at centos.org>

On 07/24/2017 03:05 PM, Kevin Stange wrote:
> On 07/20/2017 03:14 PM, Piotr Gackiewicz wrote:
>> On Thu, 20 Jul 2017, Kevin Stange wrote:
>>
>>> On 07/20/2017 05:31 AM, Piotr Gackiewicz wrote:
>>>> On Wed, 19 Jul 2017, Johnny Hughes wrote:
>>>>
>>>>> On 07/19/2017 09:23 AM, Johnny Hughes wrote:
>>>>>> On 07/19/2017 04:27 AM, Piotr Gackiewicz wrote:
>>>>>>> On Mon, 17 Jul 2017, Johnny Hughes wrote:
>>>>>>>
>>>>>>>> Are the testing kernels (kernel-4.9.37-29.el7 and
>>>>>>>> kernel-4.9.37-29.el6, with the one config file change)
>>>>>>>> working for everyone?
>>>>>>>>
>>>>>>>> (turn off: CONFIG_IO_STRICT_DEVMEM)
>>>>>>>
>>>>>>> Hello.
>>>>>>> Maybe this is not the most appropriate thread or time, but I have
>>>>>>> raised it before:
>>>>>>>
>>>>>>> 4.9.* kernels no longer work well for me (nor for other people, as
>>>>>>> far as I know). The last stable kernel was 4.9.13-22.
>>>>
>>>> I think I have nailed down the faulty combo.
>>>> My tests showed that the SLUB allocator does not work well in a Xen
>>>> Dom0, on top of the Xen hypervisor.
>>>> It fails on at least one of my testing servers (an old AMD K8 with
>>>> 1 processor and 1 core, running only 1 paravirt guest).
>>>> If a kernel with SLUB is booted directly (without the Xen hypervisor),
>>>> it works well.
>>>> If it is booted as a module of the Xen hypervisor, it gets a page
>>>> allocation failure almost instantly.
>>>>
>>>>
>>>> SLAB was changed to SLUB in the kernel config starting from 4.9.25.
>>>> Problems then started to explode in my production environment and on
>>>> the testing server mentioned above.
>>>>
>>>> After recompiling the recent 4.9.34 with SLAB, everything works well on
>>>> that testing machine.
>>>> I will try to test 4.9.38 with the same config on my production servers.
>>>
>>> I was having page allocation failures on 4.9.25 with SLUB, but these
>>> problems seem to be gone with 4.9.34 (still with SLUB).   Have you
>>> checked this build?  It was moved to the stable repo on July 4th.
>>
>> Yes, 4.9.34 was failing too. And this was actually the worst case, with
>> an I/O error on the guest:
> 
> I did find one server running 4.9.34 that was still throwing SLUB page
> allocation errors, but oddly, the only servers ever to have this issue
> for me are spares that are running no domains.  I've just tried booting
> that box up on 4.9.39, but I may not know if the switch back to SLAB
> fixes anything for several weeks.
> 
> Otherwise, the other server, which I've been running 4.9.39 on for the
> past 72 hours, has been stable with running domains.
> 

Cool,

We have several good reports. I'll wait until Wednesday and push this
kernel to "release" if we don't get any bad reports.
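
For anyone comparing builds, here is a minimal sketch of how one might
check which slab allocator a kernel config selects and whether
CONFIG_IO_STRICT_DEVMEM is off. The helper names parse_kernel_config
and slab_allocator are made up for illustration, and the sketch assumes
the usual .config syntax ("CONFIG_FOO=y" or "# CONFIG_FOO is not set").

```python
import re


def parse_kernel_config(text):
    """Return {option: value} for each option found in kernel .config text.

    Options written as '# CONFIG_FOO is not set' are recorded as 'n'.
    (Hypothetical helper, not part of any CentOS tooling.)
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, _, value = line.partition("=")
            options[name] = value
    return options


def slab_allocator(options):
    """Return 'SLAB', 'SLUB' or 'SLOB' depending on which option is 'y'."""
    for name in ("CONFIG_SLAB", "CONFIG_SLUB", "CONFIG_SLOB"):
        if options.get(name) == "y":
            return name[len("CONFIG_"):]
    return "unknown"
```

On a dom0 one would presumably feed it the contents of the config file
installed alongside the kernel (commonly /boot/config-<kernel version>,
which is an assumption about the install layout) and then look at
slab_allocator(opts) and opts.get("CONFIG_IO_STRICT_DEVMEM").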
