On 07/24/2017 03:05 PM, Kevin Stange wrote:
> On 07/20/2017 03:14 PM, Piotr Gackiewicz wrote:
>> On Thu, 20 Jul 2017, Kevin Stange wrote:
>>
>>> On 07/20/2017 05:31 AM, Piotr Gackiewicz wrote:
>>>> On Wed, 19 Jul 2017, Johnny Hughes wrote:
>>>>
>>>>> On 07/19/2017 09:23 AM, Johnny Hughes wrote:
>>>>>> On 07/19/2017 04:27 AM, Piotr Gackiewicz wrote:
>>>>>>> On Mon, 17 Jul 2017, Johnny Hughes wrote:
>>>>>>>
>>>>>>>> Are the testing kernels (kernel-4.9.37-29.el7 and
>>>>>>>> kernel-4.9.37-29.el6, with the one config file change)
>>>>>>>> working for everyone:
>>>>>>>>
>>>>>>>> (turn off: CONFIG_IO_STRICT_DEVMEM)
>>>>>>>
>>>>>>> Hello.
>>>>>>> Maybe it's not the most appropriate thread or time, but I have been
>>>>>>> signalling it before:
>>>>>>>
>>>>>>> 4.9.* kernels do not work well for me any more (and for other people
>>>>>>> neither, as I know). Last stable kernel was 4.9.13-22.
>>>>
>>>> I think I have nailed down the faulty combo.
>>>> My tests showed, that SLUB allocator does not work well in Xen Dom0, on
>>>> top of Xen Hypervisor.
>>>> It does not work at least on one of my testing servers (old AMD K8
>>>> (1 proc, 1 core), only 1 paravirt guest).
>>>> If kernel with SLUB booted as main (w/o Xen hypervisor), it works well.
>>>> If booted as Xen hypervisor module - it almost instantly gets page
>>>> allocation failure.
>>>>
>>>>
>>>> SLAB=>SLUB was changed in kernel config, starting from 4.9.25. Then
>>>> problems started to explode in my production environment, and on
>>>> testing server mentioned above.
>>>>
>>>> After recompiling recent 4.9.34 with SLAB - everything works well on
>>>> that testing machine.
>>>> I will try to test 4.9.38 with the same config on my production servers.
>>>
>>> I was having page allocation failures on 4.9.25 with SLUB, but these
>>> problems seem to be gone with 4.9.34 (still with SLUB). Have you
>>> checked this build? It was moved to the stable repo on July 4th.
>>
>> Yes, 4.9.34 was failing too.
>> And this was actually the worst case, with
>> I/O error on guest:
>
> I did find one server running 4.9.34 that was still throwing SLUB page
> allocation errors, but oddly, the only servers ever to have this issue
> for me are spares that are running no domains. I've just tried booting
> that box up on 4.9.39, but I may not know if the switch back to SLAB
> fixes anything for several weeks.
>
> Otherwise, the other server I'm running 4.9.39 on for the past 72 hours
> has been stable with running domains.
>

Cool, We have several good reports .. I'll wait until Wednesday and push
this kernel to "release" if we don't get any bad reports.