[CentOS-virt] Crash in CentOS 7 kernel-3.10.0-514.16.1.el7.x86_64 in Xen PV mode

Tue Oct 24 19:53:25 UTC 2017
Karl Johnson <karljohnson.it at gmail.com>

On Tue, Oct 24, 2017 at 3:09 PM, Karl Johnson <karljohnson.it at gmail.com>
wrote:

> On Tue, Oct 24, 2017 at 3:36 AM, Akemi Yagi <amyagi at gmail.com> wrote:
>
>> On Mon, Oct 23, 2017 at 11:08 PM, Akemi Yagi <amyagi at gmail.com> wrote:
>>
>>> On Mon, Oct 23, 2017 at 12:57 PM, Karl Johnson
>>> <karljohnson.it at gmail.com> wrote:
>>>
>>>> On Sat, May 20, 2017 at 8:30 PM, Sarah Newman <srn at prgmr.com> wrote:
>>>>
>>>>> I experienced a bug that is likely the same as
>>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1350373 . Commit
>>>>> b7dd0e350e0bd4c0fddcc9b8958342700b00b168 , which is supposed to fix
>>>>> it, doesn't appear in this kernel and doesn't apply cleanly either.
>>>>> Is there any point in trying to backport the patch?
>>>>>
>>>> I had the same kernel panic while booting a PV domU on
>>>> 3.10.0-693.2.2.el7.centos.plus.x86_64. I had to start the domU again
>>>> to boot correctly. Can this patch be added to the CentOS 7 kernel-plus?
>>>>
>>>> Karl
>>>>
>>>
>>> I can certainly add the patch (commit b7dd0e350e0bd4c0fddcc9b8958342700b00b168)
>>> to the Plus kernel. It would be best if you could file a request on
>>> http://bugs.centos.org so that we can track it better.
>>>
>>> Akemi
>>>
>>
>> A CentOSPlus kernel set with the referenced patch applied is available
>> for testing at:
>>
>> https://people.centos.org/toracat/kernel/7/plus/xen/
>>
>> Feedback appreciated,
>>
>> Akemi
>>
>
> Thanks for the build, Akemi. I will try to test this kernel in the next few
> days; however, it will be hard to know if it fixes the kernel panic because
> I can't reproduce it. It seems to be random and pretty rare in my case.
>

The test kernel doesn't boot on my side:

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.10.0-693.5.2.el7.centos.plus.1.x86_64
(yagi2 at h64r7) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP
Mon Oct 23 22:30:37 PDT 2017
[    0.000000] Command line: console=hvc0 xencons=tty0 root=/dev/xvda1 ro
LANG=en_CA.UTF-8 elevator=noop nohz=off
[    0.000000] ACPI in unprivileged domain disabled
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x000000003fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] DMI not present or invalid.
[    0.000000] e820: last_pfn = 0x40000 max_arch_pfn = 0x400000000
[    0.000000] RAMDISK: [mem 0x0242d000-0x038e0fff]
[    0.000000] NUMA turned off
[    0.000000] Faking a node at [mem 0x0000000000000000-0x000000003fffffff]
[    0.000000] NODE_DATA(0) allocated [mem 0x3fe03000-0x3fe29fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00001000-0x0009ffff]
[    0.000000]   node   0: [mem 0x00100000-0x3fffffff]
[    0.000000] Initmem setup node 0 [mem 0x00001000-0x3fffffff]
[    0.000000] SFI: Simple Firmware Interface v0.81
http://simplefirmware.org
[    0.000000] No local APIC present
[    0.000000] APIC: disable apic facility
[    0.000000] APIC: switched to apic NOOP
[    0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
[    0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000fffff]
[    0.000000] e820: [mem 0x40000000-0xffffffff] available for PCI devices
[    0.000000] Booting paravirtualized kernel on Xen
[    0.000000] Xen version: 4.6.3-3.el6 (preserve-AD)
[    0.000000] setup_percpu: NR_CPUS:5120 nr_cpumask_bits:2 nr_cpu_ids:2
nr_node_ids:1
[    0.000000] PERCPU: Embedded 33 pages/cpu @ffff88003f800000 s97112 r8192
d29864 u1048576
[    0.000000] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes)
[    0.000000] Built 1 zonelists in Node order, mobility grouping on.
Total pages: 257930
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: console=hvc0 xencons=tty0
root=/dev/xvda1 ro LANG=en_CA.UTF-8 elevator=noop nohz=off
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100
[    0.000000] xsave: enabled xstate_bv 0x7, cntxt size 0x340 using
standard form
[    0.000000] Memory: 989236k/1048576k available (6954k kernel code, 388k
absent, 58952k reserved, 4575k data, 1768k init)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000] Hierarchical RCU implementation.
[    0.000000]     RCU restricting CPUs from NR_CPUS=5120 to nr_cpu_ids=2.
[    0.000000] NR_IRQS:327936 nr_irqs:32 0
[    0.000000] Console: colour dummy device 80x25
[    0.000000] console [tty0] enabled
[    0.000000] console [hvc0] enabled
[    0.000000] allocated 4194304 bytes of page_cgroup
[    0.000000] please try 'cgroup_disable=memory' option if you don't want
memory cgroups
[    0.000000] installing Xen timer for CPU 0
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2100.066 MHz processor
[    0.002000] Calibrating delay loop (skipped), value calculated using
timer frequency.. 4200.06 BogoMIPS (lpj=2100030)
[    0.002000] pid_max: default: 32768 minimum: 301
[    0.002000] Security Framework initialized
[    0.002000] SELinux:  Initializing.
[    0.002000] Yama: becoming mindful.
[    0.002000] Dentry cache hash table entries: 131072 (order: 8, 1048576
bytes)
[    0.002000] Inode-cache hash table entries: 65536 (order: 7, 524288
bytes)
[    0.002000] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
[    0.002000] Mountpoint-cache hash table entries: 2048 (order: 2, 16384
bytes)
[    0.002086] Initializing cgroup subsys memory
[    0.002104] Initializing cgroup subsys devices
[    0.002111] Initializing cgroup subsys freezer
[    0.002116] Initializing cgroup subsys net_cls
[    0.002122] Initializing cgroup subsys blkio
[    0.002127] Initializing cgroup subsys perf_event
[    0.002133] Initializing cgroup subsys hugetlb
[    0.002138] Initializing cgroup subsys pids
[    0.002143] Initializing cgroup subsys net_prio
[    0.002207] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[    0.002214] ENERGY_PERF_BIAS: View and update with
x86_energy_perf_policy(8)
[    0.002221] CPU: Physical Processor ID: 0
[    0.002225] CPU: Processor Core ID: 0
[    0.003093] Last level iTLB entries: 4KB 512, 2MB 0, 4MB 0
[    0.003098] Last level dTLB entries: 4KB 512, 2MB 0, 4MB 0
[    0.003103] tlb_flushall_shift: 6
[    0.036643] ftrace: allocating 26819 entries in 105 pages
[    0.043078] cpu 0 spinlock event irq 17
[    0.043086] smpboot: Max logical packages: 1
[    0.043118] Performance Events: unsupported p6 CPU model 62 no PMU
driver, software events only.
[    0.044508] NMI watchdog: disabled (cpu0): hardware events not enabled
[    0.044515] NMI watchdog: Shutting down hard lockup detector on all cpus
[    0.044598] installing Xen timer for CPU 1
[    0.044613] cpu 1 spinlock event irq 24
[    0.044678] SMP alternatives: switching to SMP code
[    0.002000] [Firmware Bug]: CPU1: APIC id mismatch. Firmware: ffff APIC:
6
[    0.072708] Brought up 2 CPUs
[    0.073046] devtmpfs: initialized
[    0.075736] EVM: security.selinux
[    0.075742] EVM: security.ima
[    0.075746] EVM: security.capability
[    0.076705] atomic64 test passed for x86-64 platform with CX8 and with
SSE
[    0.076714] pinctrl core: initialized pinctrl subsystem
[    0.076763] xen:grant_table: Grant tables using version 2 layout
[    0.076775] BUG: unable to handle kernel NULL pointer dereference at
0000000000000010
[    0.076786] IP: [<ffffffff813f6d0f>] gnttab_init+0xff/0x260
[    0.076796] PGD 0
[    0.076802] Oops: 0002 [#1] SMP
[    0.076808] Modules linked in:
[    0.076817] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
3.10.0-693.5.2.el7.centos.plus.1.x86_64 #1
[    0.076825] task: ffff88003da38000 ti: ffff88003daa0000 task.ti:
ffff88003daa0000
[    0.076831] RIP: e030:[<ffffffff813f6d0f>]  [<ffffffff813f6d0f>]
gnttab_init+0xff/0x260
[    0.076840] RSP: e02b:ffff88003daa3df8  EFLAGS: 00010286
[    0.076844] RAX: ffff88003d405000 RBX: 0000000000000000 RCX:
000000000001a210
[    0.076849] RDX: 0000000000000000 RSI: 000000000000001e RDI:
0000000000000000
[    0.076854] RBP: ffff88003daa3e40 R08: 0000000000000000 R09:
000000000001a1b0
[    0.076859] R10: ffff88003fe03800 R11: 0000000000000001 R12:
0000000000000000
[    0.077000] R13: 0000000000000001 R14: 0000000000000010 R15:
0000000000000000
[    0.077000] FS:  0000000000000000(0000) GS:ffff88003f800000(0000)
knlGS:0000000000000000
[    0.077000] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.077000] CR2: 0000000000000010 CR3: 0000000001a0a000 CR4:
0000000000042660
[    0.077000] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[    0.077000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[    0.077000] Stack:
[    0.077000]  0000000000000000 0000000400007ff0 ffff000000000020
000000001fffa6db
[    0.077000]  ffffffff81a11020 ffff88003e002a70 ffffffff813f6e70
0000000000000000
[    0.077000]  0000000000000000 ffff88003daa3e50 ffffffff813f6e93
ffff88003daa3e80
[    0.077000] Call Trace:
[    0.077000]  [<ffffffff813f6e70>] ? gnttab_init+0x260/0x260
[    0.077000]  [<ffffffff813f6e93>] __gnttab_init+0x23/0x40
[    0.077000]  [<ffffffff810020e8>] do_one_initcall+0xb8/0x230
[    0.077000]  [<ffffffff81b5d1fb>] kernel_init_freeable+0x17a/0x219
[    0.077000]  [<ffffffff81b5c9d4>] ? initcall_blacklist+0xb0/0xb0
[    0.077000]  [<ffffffff816a3d20>] ? rest_init+0x80/0x80
[    0.077000]  [<ffffffff816a3d2e>] kernel_init+0xe/0xf0
[    0.077000]  [<ffffffff816c5f98>] ret_from_fork+0x58/0x90
[    0.077000]  [<ffffffff816a3d20>] ? rest_init+0x80/0x80
[    0.077000] Code: 00 00 66 2e 0f 1f 84 00 00 00 00 00 83 c3 01 41 39 dd
0f 86 84 00 00 00 4c 63 e3 31 f6 bf d0 00 00 00 4e 8d 34 e0 e8 01 09 d9 ff
<49> 89 06 48 8b 05 37 0d bf 00 4a 83 3c e0 00 75 d0 48 89 c7 41
[    0.077000] RIP  [<ffffffff813f6d0f>] gnttab_init+0xff/0x260
[    0.077000]  RSP <ffff88003daa3df8>
[    0.077000] CR2: 0000000000000010
[    0.077000] ---[ end trace ad7a936cdeb5166e ]---
[    0.077000] Kernel panic - not syncing: Fatal exception

I switched back to 3.10.0-693.2.2.el7.centos.plus.x86_64.
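
For anyone triaging a similar console log, the faulting address and symbol can be pulled out of the oops mechanically. The following is only a minimal Python sketch: the function name `summarize_oops` and the `OOPS` excerpt are illustrative, and it assumes the lines wrapped by the mail client have been rejoined first. It reads the same values visible in the log above (fault address 0x10, matching CR2, inside gnttab_init at offset 0xff of 0x260).

```python
import re

# Excerpt of the oops above, with the mail client's line wrapping undone.
OOPS = """\
[    0.076775] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[    0.076786] IP: [<ffffffff813f6d0f>] gnttab_init+0xff/0x260
[    0.077000] Kernel panic - not syncing: Fatal exception
"""


def summarize_oops(text):
    """Return the fault address and faulting symbol of an x86 oops, or None.

    Hypothetical helper: it only understands the two line formats shown in
    the excerpt ("BUG: unable to handle kernel ... at ADDR" and
    "IP: [<ADDR>] symbol+0xOFF/0xSIZE").
    """
    fault = re.search(r"BUG: unable to handle kernel .* at ([0-9a-f]+)", text)
    ip = re.search(
        r"IP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text
    )
    if not fault or not ip:
        return None
    return {
        "fault_addr": int(fault.group(1), 16),  # address that was dereferenced
        "ip": int(ip.group(1), 16),             # instruction pointer at the fault
        "symbol": ip.group(2),                  # function containing the fault
        "offset": int(ip.group(3), 16),         # byte offset into that function
        "size": int(ip.group(4), 16),           # total size of that function
    }


if __name__ == "__main__":
    print(summarize_oops(OOPS))
```

The symbol and offset are what a debuginfo-based lookup would need to map the fault back to a source line; that step is not shown here.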