On 09/04/2017 05:27 PM, Johnny Hughes wrote: > On 09/04/2017 03:59 PM, Kevin Stange wrote: >> On 09/02/2017 08:11 AM, Johnny Hughes wrote: >>> On 09/01/2017 02:41 PM, Kevin Stange wrote: >>>> On 08/31/2017 07:50 AM, PJ Welsh wrote: >>>>> A recently created and fully functional CentOS 7.3 VM fails to boot >>>>> after applying CR updates: >>>> <snip> >>>>> Server OS is CentOS 7.3 using Xen (no CR updates): >>>>> rpm -qa xen\* >>>>> xen-hypervisor-4.6.3-15.el7.x86_64 >>>>> xen-4.6.3-15.el7.x86_64 >>>>> xen-licenses-4.6.3-15.el7.x86_64 >>>>> xen-libs-4.6.3-15.el7.x86_64 >>>>> xen-runtime-4.6.3-15.el7.x86_64 >>>>> >>>>> uname -a >>>>> Linux tsxen2.xx.com <http://tsxen2.xx.com> 4.9.39-29.el7.x86_64 #1 SMP >>>>> Fri Jul 21 15:09:00 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux >>>>> >>>>> Sadly, the other issue is that the grub menu will not display for me to >>>>> select another kernel to see if it is just a kernel issue. >>>>> >>>>> The dracut prompt does not show any /dev/disk folder either. >>>>> >>>> >>>> I'm seeing this as well. My host is 4.9.44-29 and Xen 4.4.4-26 from >>>> testing repo, my guest is 3.10.0-693.1.1. Guest boots fine with >>>> 514.26.2. The kernel messages that appear to kick off the failure for >>>> me start with a page allocation failure. It eventually reaches dracut >>>> failures due to systemd/udev not setting up properly, but I think the >>>> root is this: >>>> >>>> [ 1.970630] ------------[ cut here ]------------ >>>> [ 1.970651] WARNING: CPU: 2 PID: 225 at mm/vmalloc.c:131 >>>> vmap_page_range_noflush+0x2c1/0x350 >>>> [ 1.970660] Modules linked in: >>>> [ 1.970668] CPU: 2 PID: 225 Comm: systemd-udevd Not tainted >>>> 3.10.0-693.1.1.el7.x86_64 #1 >>>> [ 1.970677] 0000000000000000 000000008cddc75d ffff8803e8587bd8 >>>> ffffffff816a3d91 >>>> [ 1.970688] ffff8803e8587c18 ffffffff810879c8 00000083811c14e8 >>>> ffff8800066eb000 >>>> [ 1.970698] 0000000000000001 ffff8803e86d6940 ffffffffc0000000 >>>> 0000000000000000 >>>> [ 1.970708] Call Trace: >>>> [ 1.970725] [<ffffffff816a3d91>] dump_stack+0x19/0x1b >>>> [ 1.970736] [<ffffffff810879c8>] __warn+0xd8/0x100 >>>> [ 1.970742] [<ffffffff81087b0d>] warn_slowpath_null+0x1d/0x20 >>>> [ 1.970748] [<ffffffff811c0781>] vmap_page_range_noflush+0x2c1/0x350 >>>> [ 1.970758] [<ffffffff811c083e>] map_vm_area+0x2e/0x40 >>>> [ 1.970765] [<ffffffff811c1590>] __vmalloc_node_range+0x170/0x270 >>>> [ 1.970774] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>> [ 1.970781] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>> [ 1.970792] [<ffffffff8105f143>] module_alloc+0x73/0xd0 >>>> [ 1.970798] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>> [ 1.970804] [<ffffffff810fe754>] module_alloc_update_bounds+0x14/0x70 >>>> [ 1.970811] [<ffffffff810ff2d2>] load_module+0xb02/0x29e0 >>>> [ 1.970817] [<ffffffff811c0717>] ? vmap_page_range_noflush+0x257/0x350 >>>> [ 1.970823] [<ffffffff811c083e>] ? map_vm_area+0x2e/0x40 >>>> [ 1.970829] [<ffffffff811c1590>] ? __vmalloc_node_range+0x170/0x270 >>>> [ 1.970838] [<ffffffff81101249>] ? SyS_init_module+0x99/0x110 >>>> [ 1.970846] [<ffffffff81101275>] SyS_init_module+0xc5/0x110 >>>> [ 1.970856] [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b >>>> [ 1.970862] ---[ end trace 2117480876ed90d2 ]--- >>>> [ 1.970869] vmalloc: allocation failure, allocated 24576 of 28672 bytes >>>> [ 1.970874] systemd-udevd: page allocation failure: order:0, mode:0xd2 >>>> [ 1.970883] CPU: 2 PID: 225 Comm: systemd-udevd Tainted: G W >>>> ------------ 3.10.0-693.1.1.el7.x86_64 #1 >>>> [ 1.970894] 00000000000000d2 000000008cddc75d ffff8803e8587c48 >>>> ffffffff816a3d91 >>>> [ 1.970910] ffff8803e8587cd8 ffffffff81188810 ffffffff8190ea38 >>>> ffff8803e8587c68 >>>> [ 1.970923] ffffffff00000018 ffff8803e8587ce8 ffff8803e8587c88 >>>> 000000008cddc75d >>>> [ 1.970939] Call Trace: >>>> [ 1.970946] [<ffffffff816a3d91>] dump_stack+0x19/0x1b >>>> [ 1.970961] [<ffffffff81188810>] warn_alloc_failed+0x110/0x180 >>>> [ 1.970971] [<ffffffff811c1654>] __vmalloc_node_range+0x234/0x270 >>>> [ 1.970981] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>> [ 1.970989] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>> [ 1.970999] [<ffffffff8105f143>] module_alloc+0x73/0xd0 >>>> [ 1.971031] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>> [ 1.971038] [<ffffffff810fe754>] module_alloc_update_bounds+0x14/0x70 >>>> [ 1.971046] [<ffffffff810ff2d2>] load_module+0xb02/0x29e0 >>>> [ 1.971052] [<ffffffff811c0717>] ? vmap_page_range_noflush+0x257/0x350 >>>> [ 1.971061] [<ffffffff811c083e>] ? map_vm_area+0x2e/0x40 >>>> [ 1.971067] [<ffffffff811c1590>] ? __vmalloc_node_range+0x170/0x270 >>>> [ 1.971075] [<ffffffff81101249>] ? SyS_init_module+0x99/0x110 >>>> [ 1.971081] [<ffffffff81101275>] SyS_init_module+0xc5/0x110 >>>> [ 1.971088] [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b >>>> [ 1.971094] Mem-Info: >>>> [ 1.971103] active_anon:875 inactive_anon:2049 isolated_anon:0 >>>> [ 1.971103] active_file:791 inactive_file:8841 isolated_file:0 >>>> [ 1.971103] unevictable:0 dirty:0 writeback:0 unstable:0 >>>> [ 1.971103] slab_reclaimable:1732 slab_unreclaimable:1629 >>>> [ 1.971103] mapped:1464 shmem:2053 pagetables:480 bounce:0 >>>> [ 1.971103] free:4065966 free_pcp:763 free_cma:0 >>>> [ 1.971131] Node 0 DMA free:15912kB min:12kB low:12kB high:16kB >>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB >>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB >>>> managed:15912kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB >>>> slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB >>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB >>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no >>>> [ 1.971217] lowmem_reserve[]: 0 4063 16028 16028 >>>> [ 1.971226] Node 0 DMA32 free:4156584kB min:4104kB low:5128kB >>>> high:6156kB active_anon:952kB inactive_anon:1924kB active_file:0kB >>>> inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB >>>> present:4177920kB managed:4162956kB mlocked:0kB dirty:0kB writeback:0kB >>>> mapped:4kB shmem:1928kB slab_reclaimable:240kB slab_unreclaimable:504kB >>>> kernel_stack:32kB pagetables:592kB unstable:0kB bounce:0kB >>>> free_pcp:1760kB local_pcp:288kB free_cma:0kB writeback_tmp:0kB >>>> pages_scanned:0 all_unreclaimable? no >>>> [ 1.971264] lowmem_reserve[]: 0 0 11964 11964 >>>> [ 1.971273] Node 0 Normal free:12091564kB min:12088kB low:15108kB >>>> high:18132kB active_anon:2352kB inactive_anon:6272kB active_file:3164kB >>>> inactive_file:35364kB unevictable:0kB isolated(anon):0kB >>>> isolated(file):0kB present:12591104kB managed:12251788kB mlocked:0kB >>>> dirty:0kB writeback:0kB mapped:5852kB shmem:6284kB >>>> slab_reclaimable:6688kB slab_unreclaimable:6012kB kernel_stack:880kB >>>> pagetables:1328kB unstable:0kB bounce:0kB free_pcp:1196kB >>>> local_pcp:152kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 >>>> all_unreclaimable? no >>>> [ 1.971309] lowmem_reserve[]: 0 0 0 0 >>>> [ 1.971316] Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 1*32kB (U) 2*64kB (U) >>>> 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = >>>> 15912kB >>>> [ 1.971343] Node 0 DMA32: 7*4kB (M) 18*8kB (UM) 7*16kB (EM) 3*32kB >>>> (EM) 1*64kB (E) 2*128kB (UM) 1*256kB (E) 4*512kB (UM) 4*1024kB (UEM) >>>> 4*2048kB (EM) 1011*4096kB (M) = 4156348kB >>>> [ 1.971377] Node 0 Normal: 64*4kB (UEM) 10*8kB (UEM) 6*16kB (EM) >>>> 3*32kB (EM) 3*64kB (UE) 3*128kB (UEM) 1*256kB (E) 2*512kB (UE) 0*1024kB >>>> 1*2048kB (M) 2951*4096kB (M) = 12091728kB >>>> [ 1.971413] Node 0 hugepages_total=0 hugepages_free=0 >>>> hugepages_surp=0 hugepages_size=2048kB >>>> [ 1.971425] 11685 total pagecache pages >>>> [ 1.971430] 0 pages in swap cache >>>> [ 1.971437] Swap cache stats: add 0, delete 0, find 0/0 >>>> [ 1.971444] Free swap = 0kB >>>> [ 1.971451] Total swap = 0kB >>>> [ 1.971456] 4196255 pages RAM >>>> [ 1.971462] 0 pages HighMem/MovableOnly >>>> [ 1.971467] 88591 pages reserved >>>> >>> >>> Do any of you guys have access to RHEL to try the RHEL 7.4 Kernel? >> >> I think I may. I haven't tried yet, but I'll see if I can get my hands >> on one and test it tomorrow when I'm back at the office tomorrow. >> >> RH closed my bug as "WONTFIX" so far, saying Red Hat Quality Engineering >> Management declined the request. I started to look at the Red Hat >> source browser to see the list of patches from 693 to 514, but getting >> the full list seems impossible because the change log only goes back to >> 644 and there doesn't seem to be a way to obtain full builds of >> unreleased kernels. Unless I'm mistaken. >> >> I will also do some digging via RH support if I can. >> > I would think that RH would want AWS support for RHEL 7.4 and I thought > AWS was run on Xen // Note: I could be wrong about that. > > In any event, at the very least, we can make a kernel that boots PV for > 7.4 at some point. AWS does run on Xen, but the modifications they make to Xen are not known to me nor which version of Xen they use. They may also run the domains as HVM, which seems to mitigate the issue here. I just verified this kernel issue exists on a RHEL 7.3 system image under the same conditions, when it's updated to RHEL 7.4 and kernel 3.10.0-693.2.1.el7.x86_64. -- Kevin Stange Chief Technology Officer Steadfast | Managed Infrastructure, Datacenter and Cloud Services 800 S Wells, Suite 190 | Chicago, IL 60607 312.602.2689 X203 | Fax: 312.602.2688 kevin at steadfast.net | www.steadfast.net