[CentOS] CentOS7 latest kernel still does not run KVM guests

Tue Dec 27 05:40:20 UTC 2022
Nathan Coulson <nathan at bravenet.com>

There is https://bugzilla.redhat.com/show_bug.cgi?id=2143438, which I ran
into on AlmaLinux 8.  It affects Xeon 55xx processors, but Xeon 56xx and
newer seem to work.

I am just a user who ran into it (I had nothing to do with the fix).  I
resolved it by upgrading the server, an ancient Intel S5520HC board, to a
Xeon 56xx processor.
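
For anyone wanting to check whether a host falls into the same CPU family,
here is a minimal Python sketch. It rests on assumptions: the function names
are hypothetical, the "model name" format is what older Xeons typically
report in /proc/cpuinfo, and the 55xx/56xx split is only what was observed
in this thread, not an authoritative list from the bug report.

```python
import re

# Matches model names such as "Intel(R) Xeon(R) CPU E5507 @ 2.27GHz".
# Assumption: four-digit Xeon models prefixed by E, L, W or X.
_XEON_RE = re.compile(r"Xeon\(R\) CPU\s+[ELWX](\d{4})\b")


def xeon_series(cpuinfo_text):
    """Return the two-digit Xeon series (e.g. "55") or None if not matched."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("model name"):
            match = _XEON_RE.search(line)
            # Only the first "model name" line is inspected.
            return match.group(1)[:2] if match else None
    return None


def likely_affected(cpuinfo_text):
    """True if the CPU is a Xeon 55xx, the family reported as hanging here."""
    return xeon_series(cpuinfo_text) == "55"
```

To try it on a live host, pass in the contents of /proc/cpuinfo, for example
likely_affected(open("/proc/cpuinfo").read()). A result of False only means
the model string did not match 55xx; it does not prove the host is safe.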

On 2022-12-20 09:08, Joshua Kramer wrote:
> "In fact, the system runs great on the newest kernel, right up to the
> point where a VM is started.  It will run for days as long as I never
> start a VM.  Start a VM and BAM!  It is hung hard."
> 
> Are you required to use "official supported kernels" or do you have some
> flexibility?  My main KVM host is a CentOS 7 box and I'm using kernel-ml
> from elrepo-kernel.  The kernel version usually tracks recent kernel
> releases (on my el9 boxes it's currently at 6.1), but I'm running 5.19 on
> my CentOS 7 KVM host with no issue.
> 
> On Tue, Dec 20, 2022 at 10:24 AM Bill Gee <bgee at campercaver.net> wrote:
> 
>> Hmmm....   I have dealt with bad power supplies.  I doubt it is the
>> problem in this case.  If it were a power supply, then why does the
>> system work perfectly on the older kernel?
>> 
>> In fact, the system runs great on the newest kernel, right up to the
>> point where a VM is started.  It will run for days as long as I never
>> start a VM.  Start a VM and BAM!  It is hung hard.
>> 
>> ===============
>> Bill Gee
>> 
>> On 12/20/22 08:30, Christopher Wensink wrote:
>> > I have had two different Router machines do something similar on the
>> > IPFire OS, and the core cause ended up being power related.  One power
>> > supply was intermittently dying with the whole system hanging, and the
>> > only option was a hard reset.  The second system also had issues with
>> > the hard drive and also the power supply.  These units were Mini ITX
>> > boards, Super micro Sys-E200-9B with the Pentium N3710 Quad Core,
>> > System-on-chip, 8GB Ram, 120 GB SSD, Quad NIC Cards, and they used
>> > external 60W DC power adapters similar to a higher end laptop style.
>> >
>> > I don't blame the manufacturer, I think it was an issue with the power
>> > supplies going bad.
>> >
>> > Chris
>> >
>> > On 12/20/2022 8:16 AM, Bill Gee wrote:
>> >> Hi Johnny -
>> >>
>> >> Yipes, I hate problems like this!
>> >>
>> >> The host computer is a SuperMicro C2SBC-Q mainboard.  The processor is
>> >> an Intel Core2-Quad Q9400.  Yes, it is x86_64 architecture.  The
>> >> display adapter is an older nVidia GeForce 8400 GS, and I use the
>> >> nouveau driver for it.  Selinux is disabled.
>> >>
>> >> The guest machines are Fedora 37 and CentOS7.
>> >>
>> >> So far I have found no log files with anything useful.  The hang
>> >> happens so quickly that nothing gets logged.  Here is a section of
>> >> /var/log/messages.  Notice the gap at 06:32 to 06:43.  This is where I
>> >> started a virtual guest and the system hung.  At reboot I chose a
>> >> different kernel.
>> >>
>> >> ======================
>> >> Dec 20 06:32:26 practice7 systemd: Starting Fingerprint Authentication
>> >> Daemon...
>> >> Dec 20 06:32:26 practice7 dbus[750]: [system] Successfully activated
>> >> service 'net.reactivated.Fprint'
>> >> Dec 20 06:32:26 practice7 systemd: Started Fingerprint Authentication
>> >> Daemon.
>> >> Dec 20 06:32:26 practice7 dbus[750]: [system] Activating via systemd:
>> >> service name='org.freedesktop.realmd' unit='realmd.service'
>> >> Dec 20 06:32:26 practice7 systemd: Starting Realm and Domain
>> >> Configuration...
>> >> Dec 20 06:32:26 practice7 dbus[750]: [system] Successfully activated
>> >> service 'org.freedesktop.realmd'
>> >> Dec 20 06:32:26 practice7 systemd: Started Realm and Domain
>> >> Configuration.
>> >> Dec 20 06:32:48 practice7 systemd: Starting Stop Read-Ahead Data
>> >> Collection...
>> >> Dec 20 06:32:48 practice7 systemd: Started Stop Read-Ahead Data
>> >> Collection.
>> >> Dec 20 06:43:08 practice7 journal: Runtime journal is using 8.0M (max
>> >> allowed 391.0M, trying to leave 586.5M free of 3.8G available →
>> >> current limit 391.0M).
>> >> Dec 20 06:43:08 practice7 kernel: microcode: microcode updated early
>> >> to revision 0xa0b, date = 2010-09-28
>> >> Dec 20 06:43:08 practice7 kernel: Initializing cgroup subsys cpuset
>> >> Dec 20 06:43:08 practice7 kernel: Initializing cgroup subsys cpu
>> >> Dec 20 06:43:08 practice7 kernel: Initializing cgroup subsys cpuacct
>> >> Dec 20 06:43:08 practice7 kernel: Linux version
>> >> 3.10.0-1160.76.1.el7.x86_64 (mockbuild at kbuilder.bsys.centos.org) (gcc
>> >> version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Wed Aug 10
>> >> 16:21:17 UTC 2022
>> >>
>> >> ==========================
>> >>
>> >> Is there anyplace else I should look for log files?  Is there a way to
>> >> get verbose logging?
>> >>
>> >> How might I check for a kernel panic?  The display never says anything
>> >> about a kernel panic - it just hangs.
>> >>
>> >> Thanks!
>> >>
>> >> ===============
>> >> Bill Gee
>> >>
>> >> On 12/20/22 07:25, Johnny Hughes wrote:
>> >>> On 12/20/22 06:50, Bill Gee wrote:
>> >>>> The two latest kernels for CentOS7 are complete fails for running
>> >>>> KVM and QEMU guest machines.
>> >>>>
>> >>>> Version 3.10.0-1160.76.1 works correctly.  Both 3.10.0-1160.80.1 and
>> >>>> 3.10.0-1160.81.1 will hang within seconds of launching any virtual
>> >>>> machine.  It is a HARD hang.  I have to pull the power cord from the
>> >>>> computer in order to regain control.
>> >>>>
>> >>>> Since 81.1 came out within the last few days, I assumed it would
>> >>>> contain a fix for this problem.  It does not.
>> >>>>
>> >>>> Does anyone know when a kernel will be released that fixes this
>> >>>> problem?
>> >>>>
>> >>> This is not true for all KVM guests.  This kernel is actually
>> >>> installed and test booted before release on a cold iron, KVM VM,
>> >>> Virtual Box VM, and ESXi VM.
>> >>>
>> >>> It also passes our t_functional test suite:
>> >>>
>> >>>
>> >>>
>> >>> All C7 updates run through all these tests for all new rpms.
>> >>>
>> >>> So this problem has some other specific cause.  Is this on a E5507
>> >>> processor?
>> >>>
>> >>> What OS is the KVM host running?  I assume this is x86_64 arch?
>> >>> _______________________________________________
>> >>> CentOS mailing list
>> >>> CentOS at centos.org
>> >>> https://lists.centos.org/mailman/listinfo/centos