[CentOS-virt] kernel-4.9.37-29.el7 (and el6)

Wed Jul 19 14:43:26 UTC 2017
Johnny Hughes <johnny at centos.org>

On 07/19/2017 09:23 AM, Johnny Hughes wrote:
> On 07/19/2017 04:27 AM, Piotr Gackiewicz wrote:
>> On Mon, 17 Jul 2017, Johnny Hughes wrote:
>>
>>> Are the testing kernels (kernel-4.9.37-29.el7 and kernel-4.9.37-29.el6,
>>> with the one config file change) working for everyone:
>>>
>>> (turn off: CONFIG_IO_STRICT_DEVMEM)
>>
>> Hello.
>> Maybe it's not the most appropriate thread or time, but I have
>> signalled this before:
>>
>> 4.9.* kernels do not work well for me any more (nor for other people,
>> as far as I know). The last stable kernel was 4.9.13-22.
>>
>> Since 4.9.25-26 I often get:
>> on 3 Supermicro servers (different generations):
>> - memory allocation errors on Dom0 and corresponding lost page writes
>>     due to buffer I/O error on PV guests
>> - after such a memory allocation error on dom0 I have also spotted:
>>     - NFS client hangups on guests (server not responding, still trying
>>       => server OK)
>>     - iptables lockups on PV guest reboot
>>
>> on 1 Supermicro server:
>> - memory allocation errors on Dom0 and SATA lockups (many, if not all,
>>   SATA channels at once):
>>     exception Emask 0x0 SAct 0x20 SErr 0x0 action 0x6 frozen
>>     hard resetting link
>>     failed to IDENTIFY (I/O error, err_mask=0x4)
>>     then: blk_update_request: I/O error, dev sd., sector ....
>>
>>
>> All of these machines have been tested with memtest; no memory
>> problems were detected.
>> No such things occur when I boot 4.9.13-22.
>> Most of my guests are CentOS 6 x86_64, bridged.
>>
>> Has anyone had such problems, or dealt with them somehow?
>>
>>
>> Since spotting these errors I have done many tests, compiling and testing
>> to pin down a single code change (kernel version, patch) - no conclusions
>> yet.
>>
>> But one thing has changed a lot between 4.9.13 and 4.9.25: kernel size and
>> configuration.
>> 4.9.13 size was 6MB and 4.9.24 is 7.1MB. Many modules have been
>> compiled into the kernel; here is a shortened but significant list:
>> - iptables (NETFILTER_XTABLES, IP_NF_FILTER, IP_NF_TARGET_REJECT)
>> - SATA_AHCI
>> - ATA_AHCI (PATA, what the heck?)
>> - FBDEV_FRONTEND
>> - HID_MAGICMOUSE
>> - HID_NTRIG
>> - USB_XHCI
>> - INTEL_SMARTCONNECT
>>
> Modules that are not loaded are not used.  A module has no impact at
> all on performance or compatibility unless it is used.  If you take an
> lsmod of the kernel that works and one of the kernel with issues, we can
> see if there are LOADED modules that might cause issues.
> 
> The modules that are built are the same as in Fedora and, where a module
> is in the RHEL 7 kernel, the same as in RHEL 7.
> 
> We did troubleshoot and turn off some things recently; one thing in
> particular was CONFIG_IO_STRICT_DEVMEM, which is on in Fedora but
> off in some other distros, and which causes issues with iSCSI and some
> other things.
> 
> We also added some specific Xen patches: one for netback queue, one for
> APIC, one for nested dom0.  Also, upstream has added several Xen
> patches since 4.9.13.

There are several very important patches in this kernel for Xen (for
example):

https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.9.36
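
To see exactly which options changed between two kernel builds (for example
which drivers went from module to built-in, or whether
CONFIG_IO_STRICT_DEVMEM is now off), the two kernel config files can be
compared. The sketch below is only an illustration: it assumes the configs
are available as text, typically /boot/config-<version> on CentOS, and the
function names are made up for this example.

```python
def parse_config(text):
    """Map each CONFIG_ option in a kernel config file to its value."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            opts[key] = value
        elif line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # Disabled options appear as "# CONFIG_FOO is not set".
            key = line[2:-len(" is not set")]
            opts[key] = "n"
    return opts


def changed_options(old_text, new_text):
    """Return {option: (old value, new value)} for options that differ."""
    old = parse_config(old_text)
    new = parse_config(new_text)
    changes = {}
    for key in sorted(set(old) | set(new)):
        # An option missing from one file is treated as disabled there.
        before = old.get(key, "n")
        after = new.get(key, "n")
        if before != after:
            changes[key] = (before, after)
    return changes


def became_builtin(old_text, new_text):
    """List options that went from module (=m) to built-in (=y)."""
    return [key for key, (before, after)
            in changed_options(old_text, new_text).items()
            if before == "m" and after == "y"]
```

Calling became_builtin() with the contents of the 4.9.13 and 4.9.37 config
files would list the drivers that are no longer loadable modules, which is
the kind of change described in the list quoted above.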

