I've just added the following to the CentOS bugtracker for CentOS-7 0009860. I admit to not being sure if it's the same issue or a separate one, but this and other Dell servers (I *think* they're all R420s, but I could be wrong) all do the same thing on boot.
*****************
I've just updated a CentOS 7 server to the latest kernel, vmlinuz-3.10.0-327.4.5.el7.x86_64, and the server fails to boot. It has failed on every 327 kernel.
Server: Dell R420, 2 Xeons, 124G RAM.
From the rdsosreport.txt, the relevant portion is:
[ 3.317974] <servername> systemd[1]: Starting File System Check on /dev/disk/by-label/\x2f...
[ 3.320089] <servername> systemd-fsck[590]: Failed to detect device /dev/disk/by-label//
[ 3.320567] <servername> systemd[1]: systemd-fsck-root.service: main process exited, code=exited, status=1/FAILURE
[ 3.320972] <servername> systemd[1]: Failed to start File System Check on /dev/disk/by-label/\x2f.
[ 3.321423] <servername> systemd[1]: Dependency failed for /sysroot.
[ 3.321872] <servername> systemd[1]: Dependency failed for Initrd Root File System.
[ 3.322335] <servername> systemd[1]: Dependency failed for Reload Configuration from the Real Root.
[ 3.322802] <servername> systemd[1]: Job initrd-parse-etc.service/start failed with result 'dependency'.
[ 3.323266] <servername> systemd[1]: Triggering OnFailure= dependencies of initrd-parse-etc.service.
[ 3.323697] <servername> systemd[1]: Job initrd-root-fs.target/start failed with result 'dependency'.
[ 3.324161] <servername> systemd[1]: Triggering OnFailure= dependencies of initrd-root-fs.target.
[ 3.324586] <servername> systemd[1]: Job sysroot.mount/start failed with result 'dependency'.
[ 3.324998] <servername> systemd[1]: Unit systemd-fsck-root.service entered failed state.
[ 3.325430] <servername> systemd[1]: systemd-fsck-root.service failed.
[ 3.326752] <servername> systemd[1]: Stopped dracut pre-pivot and cleanup hooo
And it stops, and drops me into the rdshell. Note that I can mkdir /mnt and mount /dev/sda1, and /boot is there, and I can mount /dev/sda3, and root is there just fine.
mark
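For anyone comparing several of these reports, a small script can pull out just the failing entries. This is only a sketch: it assumes the journal lines in rdsosreport.txt have the "[ seconds] host unit: message" shape shown above, and the list of failure markers is a guess that may need extending. The names failure_lines and first_failure are made up for illustration.

```python
import re

# Matches lines shaped like:
# "[ 3.320089] host systemd-fsck[590]: Failed to detect device ..."
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<host>\S+)\s+(?P<unit>[^:]+):\s+(?P<msg>.*)$"
)

# Substrings treated as failure markers (an assumption, extend as needed).
FAILURE_MARKERS = ("Failed", "failed", "FAILURE")


def failure_lines(text):
    """Return (timestamp, unit, message) for each log line mentioning a failure."""
    hits = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if m and any(marker in m.group("msg") for marker in FAILURE_MARKERS):
            hits.append((float(m.group("ts")), m.group("unit"), m.group("msg")))
    return hits


def first_failure(text):
    """Return the earliest failure entry, or None if there is none."""
    hits = failure_lines(text)
    return min(hits, key=lambda h: h[0]) if hits else None


# Three lines taken from the report quoted above, with the host name replaced.
SAMPLE = "\n".join([
    r"[ 3.317974] host systemd[1]: Starting File System Check on /dev/disk/by-label/\x2f...",
    r"[ 3.320089] host systemd-fsck[590]: Failed to detect device /dev/disk/by-label//",
    r"[ 3.321423] host systemd[1]: Dependency failed for /sysroot.",
])

for ts, unit, msg in failure_lines(SAMPLE):
    print(f"{ts:10.6f}  {unit}: {msg}")
```

To run it against a real report, read the file (for example with open("rdsosreport.txt").read()) and pass the text to failure_lines in place of SAMPLE.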
On Jan 27, 2016, at 1:47 PM, m.roth@5-cent.us wrote:
<snip>
I have the same issue on a 2011 iMac. Usually it takes one or two more rounds of kernels before it starts working, but I have to stay on 3.10.0-229.20.1 right now. All the 327’s crash on boot.
-wes
-----Original Message-----
From: Wes James [mailto:comptekki@me.com]
Sent: Wednesday, January 27, 2016 7:04 PM
To: CentOS mailing list
Subject: Re: [CentOS] CentOS 7, 327 kernel still crashing
<snip>
The `rpm -q --changelog` of the 327 kernel looks like they only made three 'important' changes, and I think it gives pointers to kernel.org changes you could use to find the offending patches. Have you folks considered grabbing the srpm, backing out each of the (three) changes between the pre-327 and 327 kernels, and building it yourself to figure out which thing broke your systems? Do either of you have any of the equipment listed in the 327 changes? If so, that equipment patch is the one I would focus on. Of course this will have you stepping off the CentOS reservation (thus use caution), but seeing as you are hanging back at 229, you are already on the fence. :)
When you can point to the problem, sometimes folks will get it fixed quickly: http://thread.gmane.org/gmane.linux.network.drbd/9973/focus=9996
I grant you, it was much easier back then, because the Fedora and RH folks would have the patches as ... patches ... in the rpm that you could take out with a comment, but it can still be done with a bit more research. Even more fun might be to see if the ELRepo kernel-lt or kernel-ml would work.
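As a starting point for the changelog comparison suggested above, something like the following could list the entries that appear only in the newer kernel's changelog. It is a rough sketch under a few assumptions: both kernel packages are still installed, the package names below are derived from the vmlinuz versions mentioned in this thread, and the helper names changelog and new_entries are made up for illustration.

```python
import subprocess


def changelog(package):
    """Return the output of `rpm -q --changelog <package>` for an installed package."""
    return subprocess.check_output(
        ["rpm", "-q", "--changelog", package], universal_newlines=True
    )


def new_entries(old_text, new_text):
    """Return non-blank lines of new_text that do not occur in old_text, in order."""
    seen = set(line.strip() for line in old_text.splitlines())
    return [
        line
        for line in new_text.splitlines()
        if line.strip() and line.strip() not in seen
    ]


def compare_installed(old_pkg="kernel-3.10.0-229.20.1.el7",
                      new_pkg="kernel-3.10.0-327.4.5.el7"):
    """Print changelog lines present only in new_pkg (package names are assumptions)."""
    for line in new_entries(changelog(old_pkg), changelog(new_pkg)):
        print(line)
```

Calling compare_installed() on an affected machine prints the candidate entries. The comparison is a plain line-by-line set difference, so it only narrows down where to look; it does not say which change is responsible.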
Denniston, Todd A CIV NAVSURFWARCENDIV Crane wrote:
The `rpm -q --changelog` of the 327 kernel looks like they only made three 'important' changes, and I think it gives pointers to kernel.org changes you could use to find the offending patches. Have you folks considered grabbing the srpm, backing out each of the (three) changes between the pre-327 and 327 kernels, and building it yourself to figure out which thing broke your systems?
<snip> Sorry, I really don't have the time.
HOWEVER, here's an additional datum: I just updated some servers, and one failed to reboot, also dropping into the rdshell. The thing is, this was vmlinuz-3.10.0-229.20.1.el7.x86_64, *not* a 327. When I went back to vmlinuz-3.10.0-229.14.1.el7.x86_64, I had no trouble.
Note: in the rdshell, whether with any 327 kernel or with the 229-20, I had zero issues when I made a mountpoint and mounted /boot or /.
I saved the rdshell from this morning, and have the ok to look more closely. I will note this: I'm now starting to wonder if this is possibly a systemd issue... or a grub2 issue.
mark
Ok, more info. I've just looked at the rdsosreport from a 327 kernel, and the one from this morning, from the 229-20 kernel, and I see where they croak:
[ 3.045600] lym.cit.nih.gov systemd[1]: Found device ST500NM0003-9ZM172 /.
[ 3.045950] lym.cit.nih.gov systemd[1]: Starting File System Check on /dev/disk/by-label/\x2f...
[ 3.047209] lym.cit.nih.gov systemd-fsck[575]: Failed to detect device /dev/disk/by-label//
[ 3.047337] lym.cit.nih.gov systemd[1]: systemd-fsck-root.service: main process exited, code=exited, status=1/FAILURE
[ 3.047449] lym.cit.nih.gov systemd[1]: Failed to start File System Check on /dev/disk/by-label/\x2f.
[ 3.047559] lym.cit.nih.gov systemd[1]: Dependency failed for /sysroot.
And yet, starting at line 75 of 1281, I see:
+ ls -l /dev/disk/by-id /dev/disk/by-label /dev/disk/by-path /dev/disk/by-uuid
<...>
/dev/disk/by-label:
total 0
lrwxrwxrwx 1 root 0 10 Jan 29 14:27 SWAP-sda2 -> ../../sda2
lrwxrwxrwx 1 root 0 10 Jan 29 14:27 \x2f -> ../../sda3
lrwxrwxrwx 1 root 0 10 Jan 29 14:27 \x2fboot -> ../../sda1
So, at some point, it seems to have lost sight of /dev/disk/by-label.
Any thoughts, here?
mark
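One detail in the listing above is that the link for the label "/" is named \x2f, while the systemd-fsck message spells the path with a literal slash (/dev/disk/by-label//). Whether that difference has anything to do with the failure is not established in this thread. The sketch below only shows how the escaped link names map back to labels, which may help when checking by hand what is actually present. It assumes the \xHH escapes cover single ASCII bytes, as in the names shown here; decode_label and labels_in are hypothetical helper names.

```python
import os
import re

# A backslash, an "x", and two hex digits, e.g. \x2f
_ESCAPE_RE = re.compile(r"\\x([0-9a-fA-F]{2})")


def decode_label(link_name):
    r"""Turn an escaped link name such as \x2fboot back into /boot.

    Only single-byte (ASCII) escapes are handled correctly; this is an
    assumption that holds for the labels shown in this thread.
    """
    return _ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), link_name)


def labels_in(directory="/dev/disk/by-label"):
    """Map each decoded label to the device its symlink resolves to."""
    result = {}
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        result[decode_label(name)] = os.path.realpath(full)
    return result


# The three link names from the listing above.
for name in (r"SWAP-sda2", r"\x2f", r"\x2fboot"):
    print(name, "->", decode_label(name))
```

On a booted system (or anywhere Python is available and /dev/disk/by-label exists), labels_in() returns a dictionary from label to resolved device path, which can be compared with what the fsck unit is asking for.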