[CentOS-virt] kernel-4.9.37-29.el7 (and el6)

Wed Jul 19 14:23:19 UTC 2017
Johnny Hughes <johnny at centos.org>

On 07/19/2017 04:27 AM, Piotr Gackiewicz wrote:
> On Mon, 17 Jul 2017, Johnny Hughes wrote:
> 
>> Are the testing kernels (kernel-4.9.37-29.el7 and kernel-4.9.37-29.el6,
>> with the one config file change) working for everyone:
>>
>> (turn off: CONFIG_IO_STRICT_DEVMEM)
> 
> Hello.
> Maybe it's not the most appropriate thread or time, but I have been
> signalling it before:
> 
> 4.9.* kernels do not work well for me any more (nor for other people,
> as far as I know). The last stable kernel was 4.9.13-22.
> 
> Since 4.9.25-26 I often get,
> on 3 Supermicro servers (different generations):
> - memory allocation errors on Dom0 and corresponding lost page writes
>     due to buffer I/O errors on PV guests
> - after such a memory allocation error on Dom0 I have also spotted:
>     - NFS client hangups on guests (server not responding, still
>       trying => server OK)
>     - iptables lockups on PV guest reboot
> 
> on 1 Supermicro server:
> - memory allocation errors on Dom0 and SATA lockups (many, if not all,
>     SATA channels at once):
>     exception Emask 0x0 SAct 0x20 SErr 0x0 action 0x6 frozen
>     hard resetting link
>     failed to IDENTIFY (I/O error, err_mask=0x4)
>     then: blk_update_request: I/O error, dev sd., sector ....
> 
> 
> All of these machines have been tested with memtest, with no memory
> problems detected.
> No such things occur when I boot 4.9.13-22.
> Most of my guests are CentOS 6 x86_64, bridged.
> 
> Has anyone had such problems, or dealt with them somehow?
> 
> 
> Since spotting these errors I have done many tests, compiling and testing
> kernels to pinpoint a single code change (kernel version, patch); no
> conclusions yet.
> 
> But one thing has changed a lot between 4.9.13 and 4.9.25: kernel size
> and configuration.
> The 4.9.13 kernel was 6MB and 4.9.24 is 7.1MB. Many modules have been
> compiled into the kernel; here is a shortened but significant list:
> - iptables (NETFILTER_XTABLES, IP_NF_FILTER, IP_NF_TARGET_REJECT)
> - SATA_AHCI
> - ATA_AHCI (PATA, what the heck?)
> - FBDEV_FRONTEND
> - HID_MAGICMOUSE
> - HID_NTRIG
> - USB_XHCI
> - INTEL_SMARTCONNECT
> 
Modules that are not loaded are not used.  A module has no impact at all
on performance or compatibility unless it is used.  If you take an lsmod
on the kernel that works and one on the kernel with issues, we can see if
there are LOADED modules that might cause issues.
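
As a rough sketch of that comparison (not an official tool), something
like the following Python could list the modules loaded under only one of
the two kernels.  It assumes the output of lsmod was saved to a text file
while booted into each kernel; the function names and the file names in
the usage note are made up for illustration.

```python
import sys


def loaded_modules(lsmod_text):
    """Return the set of module names found in saved `lsmod` output."""
    names = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header line.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def compare_lsmod(good_text, bad_text):
    """Return (only_in_good, only_in_bad) as sorted lists of module names."""
    good = loaded_modules(good_text)
    bad = loaded_modules(bad_text)
    return sorted(good - bad), sorted(bad - good)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: pass the two saved lsmod dumps as arguments.
    with open(sys.argv[1]) as f:
        good_text = f.read()
    with open(sys.argv[2]) as f:
        bad_text = f.read()
    only_good, only_bad = compare_lsmod(good_text, bad_text)
    print("Loaded only on the working kernel:", ", ".join(only_good))
    print("Loaded only on the failing kernel:", ", ".join(only_bad))
```

For example, after running "lsmod > lsmod-good.txt" on the working kernel
and "lsmod > lsmod-bad.txt" on the failing one (both file names are
placeholders), the script would be run with those two files as arguments.
Note that lsmod shows only loadable modules, so code built into the kernel
image will not appear in either list.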

The modules that are built are the same as in Fedora and, where they are
in the RHEL 7 kernel, the same as in RHEL 7.

We did troubleshoot and turn off some things recently; one thing in
particular was CONFIG_IO_STRICT_DEVMEM, which is on in Fedora but
off in some other distros, and which causes issues with iSCSI and some
other things.

We also added some specific Xen patches: one for the netback queue, one
for APIC, one for nested dom0.  Upstream has also added several Xen
patches since 4.9.13.

And yes, we did change the kernel configs specifically to add in
iptables, as many people want it.

If you can point to problems with a specific module, we can discuss it
here and turn it off if necessary.
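
To help narrow that down, one possible approach (a sketch only, not
something that has been run against these kernels) is to diff the two
kernel config files and look at options that changed, for example from
=m to =y.  The Python below assumes config files in the usual format, as
installed under /boot/config-<version>; the function names are made up
for illustration.

```python
import sys


def parse_config(text):
    """Map each CONFIG_ option to its value; unset options map to "n"."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # Lines look like "# CONFIG_FOO is not set".
            opts[line[2:-len(" is not set")]] = "n"
    return opts


def config_changes(old_text, new_text):
    """Return sorted (option, old_value, new_value) tuples that differ."""
    old = parse_config(old_text)
    new = parse_config(new_text)
    changes = []
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "absent")
        after = new.get(name, "absent")
        if before != after:
            changes.append((name, before, after))
    return changes


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: old config file first, new config file second.
    with open(sys.argv[1]) as f:
        old_text = f.read()
    with open(sys.argv[2]) as f:
        new_text = f.read()
    for name, before, after in config_changes(old_text, new_text):
        print(f"{name}: {before} -> {after}")
```

Options that went from m to y in the output are the ones that are now
built in rather than loadable, which is the kind of change worth naming
when reporting a problem with a specific option.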
