On 14-09-2017 20:57, Adi Pircalabu wrote:
> On 08-09-2017 6:17, Kevin Stange wrote:
>> On 09/06/2017 05:21 PM, Kevin Stange wrote:
>>> On 09/06/2017 08:40 AM, Johnny Hughes wrote:
>>>> On 09/05/2017 02:26 PM, Kevin Stange wrote:
>>>>> On 09/04/2017 05:27 PM, Johnny Hughes wrote:
>>>>>> On 09/04/2017 03:59 PM, Kevin Stange wrote:
>>>>>>> On 09/02/2017 08:11 AM, Johnny Hughes wrote:
>>>>>>>> On 09/01/2017 02:41 PM, Kevin Stange wrote:
>>>>>>>>> On 08/31/2017 07:50 AM, PJ Welsh wrote:
>>>>>>>>>> A recently created and fully functional CentOS 7.3 VM fails
>>>>>>>>>> to boot after applying CR updates:
>>>>>>>>> <snip>
>>>>>>>>>> Server OS is CentOS 7.3 using Xen (no CR updates):
>>>>>>>>>> rpm -qa xen\*
>>>>>>>>>> xen-hypervisor-4.6.3-15.el7.x86_64
>>>>>>>>>> xen-4.6.3-15.el7.x86_64
>>>>>>>>>> xen-licenses-4.6.3-15.el7.x86_64
>>>>>>>>>> xen-libs-4.6.3-15.el7.x86_64
>>>>>>>>>> xen-runtime-4.6.3-15.el7.x86_64
>>>>>>>>>>
>>>>>>>>>> uname -a
>>>>>>>>>> Linux tsxen2.xx.com <http://tsxen2.xx.com>
>>>>>>>>>> 4.9.39-29.el7.x86_64 #1 SMP
>>>>>>>>>> Fri Jul 21 15:09:00 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>>>>>>>>>>
>>>>>>>>>> Sadly, the other issue is that the grub menu will not display
>>>>>>>>>> for me to select another kernel to see if it is just a kernel
>>>>>>>>>> issue.
>>>>>>>>>>
>>>>>>>>>> The dracut prompt does not show any /dev/disk folder either.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm seeing this as well. My host is 4.9.44-29 and Xen 4.4.4-26
>>>>>>>>> from testing repo, my guest is 3.10.0-693.1.1. Guest boots
>>>>>>>>> fine with 514.26.2. The kernel messages that appear to kick
>>>>>>>>> off the failure for me start with a page allocation failure.
>>>>>>>>> It eventually reaches dracut failures due to systemd/udev not
>>>>>>>>> setting up properly, but I think the root is this:
>>>>>>>>>
>>> <snip>
>>>>>>>>
>>>>>>>> Do any of you guys have access to RHEL to try the RHEL 7.4
>>>>>>>> Kernel?
>>>>>>>
>>>>>>> I think I may. I haven't tried yet, but I'll see if I can get my
>>>>>>> hands on one and test it tomorrow when I'm back at the office
>>>>>>> tomorrow.
>>>>>>>
>>>>>>> RH closed my bug as "WONTFIX" so far, saying Red Hat Quality
>>>>>>> Engineering Management declined the request. I started to look
>>>>>>> at the Red Hat source browser to see the list of patches from
>>>>>>> 693 to 514, but getting the full list seems impossible because
>>>>>>> the change log only goes back to 644 and there doesn't seem to
>>>>>>> be a way to obtain full builds of unreleased kernels. Unless
>>>>>>> I'm mistaken.
>>>>>>>
>>>>>>> I will also do some digging via RH support if I can.
>>>>>>>
>>>>>> I would think that RH would want AWS support for RHEL 7.4 and I
>>>>>> thought AWS was run on Xen // Note: I could be wrong about that.
>>>>>>
>>>>>> In any event, at the very least, we can make a kernel that boots
>>>>>> PV for 7.4 at some point.
>>>>>
>>>>> AWS does run on Xen, but the modifications they make to Xen are not
>>>>> known to me nor which version of Xen they use. They may also run
>>>>> the domains as HVM, which seems to mitigate the issue here.
>>>>>
>>>>> I just verified this kernel issue exists on a RHEL 7.3 system image
>>>>> under the same conditions, when it's updated to RHEL 7.4 and kernel
>>>>> 3.10.0-693.2.1.el7.x86_64.
>>>>>
>>>>
>>>> One other option is to run the DomU's as PVHVM:
>>>> https://wiki.xen.org/wiki/Xen_Linux_PV_on_HVM_drivers
>>>>
>>>> That should be much better performance than HVM and may be a
>>>> workable solution for people who don't want to modify their VM
>>>> kernel.
>>>>
>>>> Here is more info on PVHVM:
>>>> https://wiki.xen.org/wiki/PV_on_HVM
>>>>
>>>> ================
>>>> Also heard from someone to try this Config file change to the base
>>>> kernel and rebuild:
>>>>
>>>> CONFIG_RANDOMIZE_BASE=n
>>>
>>> This suggestion was mirrored in the RH bugzilla as well, it worked,
>>> but the same issue does not exist in newer kernels which have the
>>> option on. I've posted updated findings in the CentOS bug, which
>>> includes a patch that I found which seems to fix the issue:
>>>
>>> https://bugs.centos.org/view.php?id=13763#c30014
>>
>> With many thanks to hughesjr and toracat, I was able to find a patch
>> that seems to resolve this issue and get it into CentOS Plus
>> 3.10.0-693.2.1. I've asked Red Hat to apply it to some future kernel
>> update, but that is only a dream for now.
>>
>> In the meantime, if anyone who has been experiencing the issue with PV
>> domains can try out the CentOS Plus kernel here and provide feedback,
>> I'd appreciate it!
>>
>> https://buildlogs.centos.org/c7-plus/kernel-plus/20170907163005/3.10.0-693.2.1.el7.centos.plus.x86_64/
>
> Loaded 3.10.0-693.2.2.el7.centos.plus.x86_64 successfully on two
> CentOS 7.4 PV domUs which failed previously on
> kernel-3.10.0-693.2.2.el7.x86_64, the 2 hypervisors tested are:
> 1. CentOS 6.9, kernel 4.9.13-22.el6.x86_64, Xen 4.6.3-8.el6
> 2. CentOS 7.3, kernel 4.9.31-27.el7.x86_64, Xen 4.9.31-27.el7.x86_64

Should read:
1. CentOS 6.9, kernel 4.9.13-22.el6.x86_64, Xen 4.6.3-8.el6
2. CentOS 7.3, kernel 4.9.31-27.el7.x86_64, Xen 4.6.3-15.el7

---
Adi Pircalabu, System Administrator