[CentOS-virt] Crash in CentOS 7 kernel-3.10.0-514.16.1.el7.x86_64 in Xen PV mode

Mon Oct 23 19:57:45 UTC 2017
Karl Johnson <karljohnson.it at gmail.com>

On Sat, May 20, 2017 at 8:30 PM, Sarah Newman <srn at prgmr.com> wrote:

> I experienced a bug that is likely the same as
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1350373 . Commit
> b7dd0e350e0bd4c0fddcc9b8958342700b00b168 , which is supposed to fix it,
> doesn't appear in this kernel and doesn't apply cleanly either.
> Is there any point in trying to backport the patch?
>
> The backtrace is as follows:
>
> [   32.304666] ------------[ cut here ]------------
> [   32.304679] kernel BUG at arch/x86/kernel/paravirt.c:252!
> [   32.304683] invalid opcode: 0000 [#1] SMP
> [   32.304687] Modules linked in: ip6t_rpfilter ipt_REJECT nf_reject_ipv4
> ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat
> ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6
> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw
> iptable_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
> iptable_mangle iptable_security iptable_raw ebtable_filter ebtables
> ip6table_filter
> ip6_tables iptable_filter intel_powerclamp coretemp pcspkr ip_tables ext4
> mbcache jbd2 xen_netfront xen_blkfront crc32c_intel
> [   32.304734] CPU: 0 PID: 3901 Comm: dracut Not tainted
> 3.10.0-514.16.1.el7.x86_64 #1
> [   32.304739] task: ffff880002598000 ti: ffff88001b728000 task.ti:
> ffff88001b728000
> [   32.304743] RIP: e030:[<ffffffff8167eb81>]  [<ffffffff8167eb81>]
> enter_lazy.part.0+0x4/0x6
> [   32.304755] RSP: e02b:ffff88001f803aa8  EFLAGS: 00010002
> [   32.304758] RAX: 0000000000000001 RBX: ffff88001eacd640 RCX:
> 00003ffffffff000
> [   32.304761] RDX: ffff880000000640 RSI: ffffc900000c8000 RDI:
> 0000000000000001
> [   32.304765] RBP: ffff88001f803aa8 R08: ffff88001f803b78 R09:
> ffffffff813d50f9
> [   32.304771] R10: ffff88001e801e00 R11: ffffea0000093dc0 R12:
> ffffc900000c9000
> [   32.304777] R13: ffffc900000c8000 R14: 0000000000000000 R15:
> ffff88001d150340
> [   32.304787] FS:  00007f64425b0740(0000) GS:ffff88001f800000(0000)
> knlGS:0000000000000000
> [   32.304796] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   32.304801] CR2: 00000000006de2c8 CR3: 000000001b405000 CR4:
> 0000000000002660
> [   32.304807] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [   32.304813] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [   32.304818] Stack:
> [   32.304823]  ffff88001f803ab8 ffffffff81061857 ffff88001f803b60
> ffffffff811b1fe2
> [   32.304833]  ffffc900000c8fff ffffc900000c9000 ffffffff819bac90
> ffffc900000c8fff
> [   32.304843]  ffffc900000c9000 ffff88001eacb000 ffffffff810206b0
> 0000000000000000
> [   32.304854] Call Trace:
> [   32.304858]  <IRQ>
> [   32.304861]  [<ffffffff81061857>] paravirt_enter_lazy_mmu+0x27/0x30
> [   32.304879]  [<ffffffff811b1fe2>] apply_to_page_range+0x282/0x460
> [   32.304888]  [<ffffffff810206b0>] ? map_pte_fn+0x60/0x60
> [   32.304894]  [<ffffffff810207fb>] arch_gnttab_map_status+0x3b/0x70
> [   32.304904]  [<ffffffff813d5176>] gnttab_map_frames_v2+0xd6/0x150
> [   32.304910]  [<ffffffff813d5291>] gnttab_map+0xa1/0x140
> [   32.304917]  [<ffffffff813d5430>] get_free_entries+0x100/0x2e0
> [   32.304923]  [<ffffffff813d56d5>] gnttab_alloc_grant_references+
> 0x15/0x30
> [   32.304933]  [<ffffffffa000bd4f>] do_blkif_request+0x6bf/0x8a0
> [xen_blkfront]
> [   32.304945]  [<ffffffff812eb0e2>] ? __freed_request+0x92/0xa0
> [   32.304951]  [<ffffffff812eb6e3>] __blk_run_queue+0x33/0x40
> [   32.304957]  [<ffffffff812eb719>] blk_start_queue+0x29/0x40
> [   32.304964]  [<ffffffffa000bf51>] kick_pending_request_queues+0x21/0x30
> [xen_blkfront]
> [   32.304975]  [<ffffffffa000c6ce>] blkif_interrupt+0x76e/0x820
> [xen_blkfront]
> [   32.304986]  [<ffffffff811dcc8b>] ? kmem_cache_free+0x1bb/0x1f0
> [   32.304995]  [<ffffffff8113079e>] handle_irq_event_percpu+0x3e/0x1e0
> [   32.305003]  [<ffffffff8113097d>] handle_irq_event+0x3d/0x60
> [   32.305004]  [<ffffffff81133647>] handle_edge_irq+0x77/0x130
> [   32.305004]  [<ffffffff813d6217>] __xen_evtchn_do_upcall+0x227/0x350
> [   32.305004]  [<ffffffff813d83c3>] xen_evtchn_do_upcall+0x33/0x50
> [   32.305004]  [<ffffffff81698c7e>] xen_do_hypervisor_callback+0x1e/0x30
> [   32.305004]  <EOI>
> [   32.305004]  [<ffffffff811af916>] ? copy_pte_range+0x2b6/0x5a0
> [   32.305004]  [<ffffffff811af8e6>] ? copy_pte_range+0x286/0x5a0
> [   32.305004]  [<ffffffff811b24d2>] ? copy_page_range+0x312/0x490
> [   32.305004]  [<ffffffff81083012>] ? dup_mm+0x362/0x680
> [   32.305004]  [<ffffffff810847ae>] ? copy_process+0x144e/0x1960
> [   32.305004]  [<ffffffff81084e71>] ? do_fork+0x91/0x2c0
> [   32.305004]  [<ffffffff81085126>] ? SyS_clone+0x16/0x20
> [   32.305004]  [<ffffffff816974d9>] ? stub_clone+0x69/0x90
> [   32.305004]  [<ffffffff81697189>] ? system_call_fastpath+0x16/0x1b
> [   32.305004] Code: 20 e9 2f ff ff ff 44 89 fa 44 89 ee 48 c7 c7 10 45 8c
> 81 31 c0 e8 9d 14 00 00 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d c3 55 48 89 e5
> <0f> 0b 66 66 66 66 90 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48
> [   32.305004] RIP  [<ffffffff8167eb81>] enter_lazy.part.0+0x4/0x6
> [   32.305004]  RSP <ffff88001f803aa8>
> [   32.305004] ---[ end trace 49f67f0d85e1ac69 ]---
> [   32.305004] Kernel panic - not syncing: Fatal exception in interrupt
>
>
I hit the same kernel panic while booting a PV domU on
3.10.0-693.2.2.el7.centos.plus.x86_64. The domU booted correctly only after
I started it a second time. Can this patch be added to the CentOS 7
kernel-plus?

[ 13.372417] ------------[ cut here ]------------
[ 13.372434] kernel BUG at arch/x86/kernel/paravirt.c:252!
[ 13.372441] invalid opcode: 0000 [#1] SMP
[ 13.372450] Modules linked in: xt_owner nf_nat_ftp xt_REDIRECT
nf_nat_redirect xt_conntrack iptable_mangle nf_conntrack_ftp xt_LOG
xt_limit xt_multiport iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 iptable_filter
vfat fat isofs xfs libcrc32c loop sb_edac edac_core coretemp intel_rapl
iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul
glue_helper ablk_helper cryptd pcspkr ip_tables ext4 mbcache jbd2
xen_blkfront xen_netfront crct10dif_pclmul crct10dif_common crc32c_intel
[ 13.372545] CPU: 0 PID: 1138 Comm: mysqld Not tainted
3.10.0-693.2.2.el7.centos.plus.x86_64 #1
[ 13.372555] task: ffff8800fb1a8fd0 ti: ffff8801e9e20000 task.ti:
ffff8801e9e20000
[ 13.372561] RIP: e030:[<ffffffff816ad7fe>] [<ffffffff816ad7fe>]
enter_lazy.part.0+0x4/0x6
[ 13.372579] RSP: e02b:ffff8801fea03a80 EFLAGS: 00010002
[ 13.372584] RAX: 0000000000000001 RBX: ffff88017d05b280 RCX:
ffffffff810215a0
[ 13.372593] RDX: ffff880000000280 RSI: 00003ffffffff000 RDI:
0000000000000001
[ 13.372599] RBP: ffff8801fea03a80 R08: ffff8801fea03b50 R09:
ffffffff813f6559
[ 13.372605] R10: ffff88017fc01d00 R11: ffffea00000c0380 R12:
ffffc90000c52000
[ 13.372616] R13: ffffc90000c50000 R14: 0000000000000000 R15:
ffff8801e3770e00
[ 13.372632] FS: 00007febcc67a900(0000) GS:ffff8801fea00000(0000)
knlGS:ffff8801fea00000
[ 13.372644] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 13.372654] CR2: 000056038881bd60 CR3: 00000000f996d000 CR4:
0000000000042660
[ 13.372663] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 13.372672] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 13.372680] Stack:
[ 13.372685] ffff8801fea03a90 ffffffff81063b05 ffff8801fea03b38
ffffffff811b2a80
[ 13.372701] ffffc90000c51fff ffffc90000c52000 ffffffff81a0ac90
ffffc90000c51fff
[ 13.372718] ffffc90000c52000 ffff88017fc50000 ffffffff810215a0
00000000fead6d00
[ 13.372733] Call Trace:
[ 13.372740] <IRQ>
[ 13.372744] [<ffffffff81063b05>] paravirt_enter_lazy_mmu+0x25/0x30
[ 13.372771] [<ffffffff811b2a80>] apply_to_page_range+0x260/0x430
[ 13.372784] [<ffffffff810215a0>] ? map_pte_fn+0x60/0x60
[ 13.372794] [<ffffffff810216eb>] arch_gnttab_map_status+0x3b/0x70
[ 13.372809] [<ffffffff813f65d6>] gnttab_map_frames_v2+0xd6/0x150
[ 13.372820] [<ffffffff813f66f1>] gnttab_map+0xa1/0x140
[ 13.372831] [<ffffffff813f6890>] get_free_entries+0x100/0x2e0
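
Both traces share the same signature: the BUG at arch/x86/kernel/paravirt.c:252, reached through paravirt_enter_lazy_mmu, apply_to_page_range and arch_gnttab_map_status. For anyone who wants to check a saved domU console log for the same crash, here is a minimal sketch in Python. The function name and the choice of these three frames as the matching criterion are my own assumptions for illustration, not part of any existing tool, and a match only suggests the same issue rather than proving it.

```python
import re

# Signature taken from the two traces in this thread.
BUG_RE = re.compile(r"kernel BUG at arch/x86/kernel/paravirt\.c:252!")

# Frames common to both traces; which frames to require is an assumption.
REQUIRED_FRAMES = {
    "paravirt_enter_lazy_mmu",
    "apply_to_page_range",
    "arch_gnttab_map_status",
}
FRAME_RE = re.compile(r"\b(" + "|".join(sorted(REQUIRED_FRAMES)) + r")\+0x")


def matches_lazy_mmu_crash(log_text):
    """Return True if log_text has the BUG line and all required frames."""
    if not BUG_RE.search(log_text):
        return False
    seen = set(FRAME_RE.findall(log_text))
    return REQUIRED_FRAMES <= seen


if __name__ == "__main__":
    import sys

    # Usage: python check_trace.py console.log
    with open(sys.argv[1], errors="replace") as fh:
        print(matches_lazy_mmu_crash(fh.read()))
```

Because the check only needs the function names followed by `+0x`, it should tolerate the different offsets seen between the 514.16.1 and 693.2.2 kernels.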

Karl