Greetings -
The short story is that I got my new install completed with the partitioning I wanted and using software RAID, but after a reboot I ended up at a GRUB prompt and do not appear to have a grub.cfg file. Here is a little history of how I got here, because I know that anyone trying to help would ask for this information. The post is a little long, but consider it complete.
Brand new Dell system with two 3TB drives, with RAID1 LVM taking all the space outside the boot partitions. I initially created the sda[1,2] and sdb[1,2] partitions via GParted, leaving the remaining space unpartitioned. A GPT partition table was put on both drives. During installation Anaconda recognized everything properly, which resulted in the following partition summary:
  sda1             /boot/efi   500 MB   EFI System Partition
  sda2             /boot       500 MB   xfs
  vg_jab-hostroot  /           8 GB     LVM xfs RAID1
  vg_jab-hostvar   /var        4 GB     LVM xfs RAID1
  vg_jab-hostswap  /swap       2 GB     LVM swap RAID1

The installer also recognized and listed these unknown partitions that were untouched during installation.

  sdb1   vfat   500 MB   standard partition
  sdb2   vfat   500 MB   standard partition
Installation proceeded successfully, and after the initial reboot of the system I used mdadm commands to watch the raid complete building before doing anything else (I know, not necessary but I am doing other things and had the time to let it complete). I rebooted the system and got a terminal prompt as expected (no GUI installed). At this point I needed to copy my /boot/efi and /boot partitions from sda[1,2] to sdb[1,2] so that the system would boot from either drive, so I issued the following sgdisk commands:
  root# sgdisk -R /dev/sdb1 /dev/sda1
  root# sgdisk -R /dev/sdb2 /dev/sda2
  root# sgdisk -G /dev/sdb1
  root# sgdisk -G /dev/sdb2

Results of the first command above:

  Found invalid GPT and valid MBR; converting MBR to GPT format.
  Warning the kernel is still using the old partition table.
  The new table will be used at the next reboot.
  The operation has completed successfully.

The same note (from the Warning on) was repeated for the other three commands.

I then installed GRUB2 on /dev/sdb1 using the following command:

  root# grub2-install /dev/sdb1

Results:

  Installing for x86_64-efi platform.
  Installation finished. No error reported.
I rebooted the system now, only to be confronted with a GRUB prompt. Thinking that this was a good opportunity for me to learn to rescue a system, since I am going to need to understand how to recover from a disk or RAID failure, I started researching and reading. It takes a little bit of work to work out which information is valuable when a lot of it refers to GRUB (not GRUB2) and makes no reference to UEFI booting and partitions. I found this Ubuntu wiki to be a pretty good source: https://help.ubuntu.com/community/Grub2/Troubleshooting#Search_.26_Set
Below is the current information of my system as seen by grub:

  grub# set
  (the important grub2 variables are:)
  prefix = (hd1, gpt2)/grub2
  root = hd1, gpt2

  grub# ls -lha
  Device proc: filesystem type procfs
  Device hd0: no known filesystem detected
  Device hd1: no known filesystem detected
  Partition hd1, gpt3: no known filesystem detected
  Partition hd1, gpt2: filesystem xfs
  Partition hd1, gpt1: filesystem fat
  Device hd2: no known filesystem detected
  Partition hd2, gpt3: no known filesystem detected
  Partition hd2, gpt2: no known filesystem detected
  Partition hd2, gpt1: no known filesystem detected

  grub# ls (hd1, gpt2) -l / /efi /grub /grub2 vmlinuz-3.10.0-123.el7.x86_64 initramfs-3.10.0-123.el7.x86_64.img ... plus some other files
Looking through the directories, I see that there is no grub.cfg file. Other than grub not recognizing the filesystem on hd2, the directories on hd2, gpt[1,2] seem to be identical to hd1, gpt[1,2] as I would assume based on the sgdisk commands I gave to copy them. My initial thinking is that I need to re-run grub2-install on hd1 (sda), but I need a running system to do that. So following the guidance I had I issued the following commands in grub to boot the system.
  grub# linux /vmlinuz-3.10.0-123.el7.x86_64 root=/dev/sda2 ro
  grub# initrd /initramfs-3.10.0-123.el7.x86_64.img
  grub# boot

Unfortunately the system hung on booting, with the following information in the "journalctl" file:

  # journalctl
  Not switching root: /sysroot does not seem to be an OS tree. /etc/os-release is missing.
  Initrd-switch-root.service: main process exited, code=exited, status=1/FAILURE
  Failed to start Switch Root.
  . . . . .
  Triggering OnFailure= dependencies of initrd-switch-root.service.
  Starting Emergency Shell. . .
  Failed to issue method call: Invalid argument
Now I am not sure that I want to get sidetracked by whatever the problem is with this boot if I can boot from a CD in linux rescue mode, do the grub install, and be back to a booting system. So let's ignore the boot error if we can. I booted from a CD in rescue mode, and it is only able to automatically mount sd3 under /mnt/sysimage (the LVM RAID1 containing mounts for / and /var). I am able to manually mount sda1 and sda2, but am not sure at what level in the filesystem to mount them (i.e., at /mnt/sda1 or at /mnt/sysimage/sda1) in order to properly run grub2-install.
So that is where I am at now. I would like to know how to repair the system rather than starting over with a new install. Can someone enlighten me on what I need to do from here? Also, if someone can speculate on why my grub.cfg is missing in the first place, I would be interested.
Also, please cc me directly on any responses, as I am only subscribed to the daily digest. Thanks.
Jeff Boyce
Meridian Environmental
www.meridianenv.com
On 10/12/14 18:13, Jeff Boyce wrote:
I then installed GRUB2 on /dev/sdb1 using the following command: root# grub2-install /dev/sdb1 Results: Installing for x86_64-efi platform. Installation finished. No error reported.
The upstream docs (see below) seem to suggest 'grub2-install /dev/sdb' rather than /dev/sdb1 (i.e., installing to the device rather than to a partition on the device). I don't know if this is the cause of your issue.
I rebooted the system now, only to be confronted with a GRUB prompt. Thinking that this was a good opportunity for me to learn to rescue a system, since I am going to need to understand how to recover from a disk or RAID failure, I started researching and reading. It takes a little bit of work to work out which information is valuable when a lot of it refers to GRUB (not GRUB2) and makes no reference to UEFI booting and partitions. I found this Ubuntu wiki to be a pretty good source: https://help.ubuntu.com/community/Grub2/Troubleshooting#Search_.26_Set
I found the upstream documentation for grub2 to be useful:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/htm...
Included is a procedure for completely reinstalling grub2 which might help you recover.
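For reference, on a UEFI machine that reinstall procedure comes down to roughly the two commands in the sketch below. This is a hedged illustration rather than a verified fix for this system. It assumes the commands are run inside the installed system (for example after `chroot /mnt/sysimage` from rescue media, with /boot and /boot/efi mounted inside it), that the packages are named grub2-efi and shim as on RHEL/CentOS 7, and that the EFI vendor directory is `centos` (`redhat` on RHEL). The `run` helper, the `DRY_RUN` switch, and the function name are made up for illustration; as written, the script only prints the commands.

```shell
#!/bin/sh
# Sketch: reinstall GRUB2 and regenerate grub.cfg on a UEFI CentOS 7 system.
# Assumption: executed inside the installed system (e.g. a rescue chroot).
# DRY_RUN=1 only prints the commands; change it to 0 to actually run them.
DRY_RUN=1

# Illustrative helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reinstall_grub2_efi() {
    efi_dir=$1   # vendor directory under /boot/efi/EFI (assumed: centos)
    # Put the GRUB2 EFI binaries and shim back on the EFI System Partition.
    run yum reinstall grub2-efi shim
    # Regenerate the configuration file that GRUB2 reads at boot on UEFI.
    run grub2-mkconfig -o "/boot/efi/EFI/$efi_dir/grub.cfg"
}

reinstall_grub2_efi centos
```

Note that on a UEFI install the generated grub.cfg lives on the EFI System Partition rather than under /boot/grub2.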
----- Original Message -----
From: "Ned Slider" ned@unixmail.co.uk
To: centos@centos.org
Cc: jboyce@meridianenv.com
Sent: Wednesday, December 10, 2014 8:53 PM
Subject: Re: [CentOS] CentOS 7 grub.cfg missing on new install
On 10/12/14 18:13, Jeff Boyce wrote:
Greetings -
The short story is that I got my new install completed with the partitioning I wanted and using software RAID, but after a reboot I ended up at a GRUB prompt and do not appear to have a grub.cfg file. Here is a little history of how I got here, because I know that anyone trying to help would ask for this information. The post is a little long, but consider it complete.
. . . trim . . .
I then installed GRUB2 on /dev/sdb1 using the following command: root# grub2-install /dev/sdb1 Results: Installing for x86_64-efi platform. Installation finished. No error reported.
The upstream docs (see below) seem to suggest 'grub2-install /dev/sdb' rather than /dev/sdb1 (i.e., installing to the device rather than to a partition on the device). I don't know if this is the cause of your issue.
I rebooted the system now, only to be confronted with a GRUB prompt. Thinking that this was a good opportunity for me to learn to rescue a system, since I am going to need to understand how to recover from a disk or RAID failure, I started researching and reading. It takes a little bit of work to work out which information is valuable when a lot of it refers to GRUB (not GRUB2) and makes no reference to UEFI booting and partitions. I found this Ubuntu wiki to be a pretty good source: https://help.ubuntu.com/community/Grub2/Troubleshooting#Search_.26_Set
I found the upstream documentation for grub2 to be useful:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/htm...
Included is a procedure for completely reinstalling grub2 which might help you recover.
. . . trim . . .
Ned, thanks for your insight. I feel like I have been sleeping with that RH7 document for the last day or so trying to understand what I messed up and how to recover; I just didn't reference it in my post. Your conclusion about grub2-install being directed to the partition rather than the device may be correct, and it is about the only little detail that I see that may have been wrong. The weird thing is that the installation should have put everything in the proper place on the primary drive, and my grub2-install command was directed at the secondary drive. That is what is confusing me, as the proper grub files should have been on the primary drive, allowing me to boot from there. It would have been nice if I had happened to check for the grub files before the failed reboot, or immediately after the installation. At this point I am not going to try to recover, but will just re-install from scratch. I have gained enough knowledge in the past few days learning about grub that at least I know the general process and how to get started, but I want to make sure I have a good clean system on the initial install. Thanks to others who at least took the time to read my long post.
Jeff
On 12/10/2014 10:13 AM, Jeff Boyce wrote:
The short story is that I got my new install completed with the partitioning I wanted and using software RAID, but after a reboot I ended up at a GRUB prompt and do not appear to have a grub.cfg file.
...
I initially created the sda[1,2] and sdb[1,2] partitions via GParted leaving the remaining space unpartitioned.
I'm pretty sure that's not necessary. I've been able to simply change the device type to RAID in the installer and get mirrored partitions. If you do your setup entirely in Anaconda, your partitions should all end up fine.
At this point I needed to copy my /boot/efi and /boot partitions from sda[1,2] to sdb[1,2] so that the system would boot from either drive, so I issued the following sgdisk commands:
  root# sgdisk -R /dev/sdb1 /dev/sda1
  root# sgdisk -R /dev/sdb2 /dev/sda2
  root# sgdisk -G /dev/sdb1
  root# sgdisk -G /dev/sdb2
sgdisk manipulates GPT, so you run it on the disk, not on individual partitions. What you've done simply scrambled information in sdb1 and sdb2.
The correct way to run it would be:

  # sgdisk -R /dev/sdb /dev/sda
  # sgdisk -G /dev/sdb

However, you would only do that if sdb were completely unpartitioned. As you had already made at least one partition on sdb a member of a RAID1 set, you should not do either of those things.
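A minimal sketch of the whole-disk form, for the case where the target disk really is blank. It only prints the commands unless `DRY_RUN` is changed. The `run`, `is_whole_disk`, and `replicate_gpt` names are made up for illustration, and the whole-disk check assumes sdX-style device names (it would wrongly reject something like an NVMe disk).

```shell
#!/bin/sh
# Sketch: copy the GPT from a source disk onto an EMPTY target disk.
# DRY_RUN=1 only prints the commands; change it to 0 to actually run them.
DRY_RUN=1

# Illustrative helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumption: whole disks look like /dev/sda, partitions like /dev/sda1.
is_whole_disk() {
    case "$1" in
        /dev/sd[a-z]) return 0 ;;
        *) return 1 ;;
    esac
}

replicate_gpt() {
    src=$1
    dst=$2
    if ! is_whole_disk "$src" || ! is_whole_disk "$dst"; then
        echo "sgdisk works on whole disks, not partitions: $src $dst" >&2
        return 1
    fi
    # -R replicates the partition table of the last argument onto the -R target.
    run sgdisk -R "$dst" "$src"
    # -G gives the copy new disk and partition GUIDs so the two disks differ.
    run sgdisk -G "$dst"
}

replicate_gpt /dev/sda /dev/sdb
```

Passing a partition such as /dev/sdb1 makes the function refuse and return an error instead of building an sgdisk command.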
The entire premise of what you're attempting is flawed. Making a partition into a RAID member is destructive. mdadm writes its metadata inside of the member partition. The only safe way to convert a filesystem is to back up its contents, create the RAID set, format the RAID volume, and restore the backup. Especially with UEFI, there are a variety of ways that can fail. Just set up the RAID sets in the installer.
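The back up, create, format, restore sequence might look roughly like the plan printed by the sketch below. It is an illustration only and prints commands rather than running them. The device names, /dev/md0, and the backup path are hypothetical, GNU tar is assumed, and the function name is invented for this example. Anything mounted beneath the mount point (such as /boot/efi under /boot) would need to be unmounted first, and /etc/fstab would need updating afterwards.

```shell
#!/bin/sh
# Sketch: move an existing filesystem onto a new RAID1 set instead of
# converting it in place. This prints a plan; it does not execute anything.
# Assumptions: GNU tar, mdadm and xfsprogs; all device names are hypothetical.

raid1_migration_plan() {
    mnt=$1      # mount point of the existing filesystem, e.g. /boot
    part_a=$2   # partition that currently holds the filesystem
    part_b=$3   # matching partition on the second disk
    md=$4       # RAID device to create
    backup=$5   # archive stored on a filesystem that is NOT being converted

    cat <<EOF
tar --one-file-system -C $mnt -cf $backup .
umount $mnt
mdadm --create $md --level=1 --raid-devices=2 $part_a $part_b
mkfs.xfs $md
mount $md $mnt
tar -C $mnt -xpf $backup
EOF
}

raid1_migration_plan /boot /dev/sda2 /dev/sdb2 /dev/md0 /root/boot-backup.tar
```

The mdadm step is the destructive one: it writes RAID metadata into both member partitions, which is why the contents are archived first.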
I then installed GRUB2 on /dev/sdb1 using the following command: root# grub2-install /dev/sdb1 Results: Installing for x86_64-efi platform. Installation finished. No error reported.
Again, you can do that, but it's not what you wanted to do. GRUB2 is normally installed on the drive itself, unless there's a chain loader that will load it from the partition where you've installed it. You wanted to:

  # grub2-install /dev/sdb
I rebooted the system now, only to be confronted with a GRUB prompt.
I'm guessing that you also constructed RAID1 volumes before rebooting, since you probably wouldn't install GRUB2 until you did so, and doing so would explain why GRUB can't find its configuration file (the filesystem has been damaged), and why GRUB shows "no known filesystem detected" on the first partition of hd1.
If so, that's expected. You can't convert a partition in-place.
Looking through the directories, I see that there is no grub.cfg file.
It would normally be in the first partition, which GRUB cannot read on your system.
So following the guidance I had I issued the following commands in grub to boot the system.
  grub# linux /vmlinuz-3.10.0-123.el7.x86_64 root=/dev/sda2 ro
  grub# initrd /initramfs-3.10.0-123.el7.x86_64.img
  grub# boot

Unfortunately the system hung on booting, with the following information in the "journalctl" file:

  # journalctl
  Not switching root: /sysroot does not seem to be an OS tree. /etc/os-release is missing.
On your system, /dev/sda2 is "/boot" not the root filesystem. Your "root=" arg should refer to your root volume, which should be something like "root=/dev/mapper/vg_jab-hostroot". dracut may also need additional args to initialize LVM2 volumes correctly, such as "rd.lvm.lv=vg_jab/hostroot". If you had encrypted your filesystems, it would also need the uuid of the LUKS volume.
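As a rough illustration of the point above, the small script below prints the lines one might type at the GRUB prompt with the root volume corrected. It is a sketch under assumptions: the volume group is taken to be vg_jab and the logical volume hostroot (inferred from the installer summary), /boot is taken to be (hd1,gpt2) as in the posted `set` output, and the function name is invented. A root filesystem on LVM over software RAID may need further dracut arguments that are not shown here.

```shell
#!/bin/sh
# Sketch: print GRUB prompt commands that point root= at the LVM root volume
# instead of the /boot partition. Nothing here touches the bootloader.
# Assumptions: /boot is (hd1,gpt2); VG and LV names are passed by the caller.

grub_boot_commands() {
    kver=$1   # kernel version string, as seen in the /boot listing
    vg=$2     # volume group name (assumed: vg_jab)
    lv=$3     # logical volume holding / (assumed: hostroot)

    cat <<EOF
set root=(hd1,gpt2)
linux /vmlinuz-$kver root=/dev/mapper/$vg-$lv ro rd.lvm.lv=$vg/$lv
initrd /initramfs-$kver.img
boot
EOF
}

grub_boot_commands 3.10.0-123.el7.x86_64 vg_jab hostroot
```

The only changes from the commands that failed are the `root=` value and the added `rd.lvm.lv=` argument; the kernel and initramfs paths stay relative to the /boot partition.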
----- Original Message -----
From: "Gordon Messmer" gordon.messmer@gmail.com
To: "CentOS mailing list" centos@centos.org
Cc: "Jeff Boyce" jboyce@meridianenv.com
Sent: Thursday, December 11, 2014 9:45 AM
Subject: Re: [CentOS] CentOS 7 grub.cfg missing on new install
On 12/10/2014 10:13 AM, Jeff Boyce wrote:
The short story is that I got my new install completed with the partitioning I wanted and using software RAID, but after a reboot I ended up at a GRUB prompt and do not appear to have a grub.cfg file.
...
I initially created the sda[1,2] and sdb[1,2] partitions via GParted leaving the remaining space unpartitioned.
I'm pretty sure that's not necessary. I've been able to simply change the device type to RAID in the installer and get mirrored partitions. If you do your setup entirely in Anaconda, your partitions should all end up fine.
It may not be absolutely necessary, but it appears to me to be the only way to get to my objective. The /boot/efi has to be on a separate partition, and it cannot be on a RAID device. The /boot can be on LVM according to the documentation I have seen, but Anaconda will give you an error and not proceed if it is. Someone pointed out to me a few days ago that this is by design in RH and CentOS. And within the installer I could not find a way to put /boot on a non-LVM RAID1 while the rest of my drive is set up with LVM RAID1. So that is when I went to GParted to manually set up the /boot/efi and /boot partitions before running the installer.
At this point I needed to copy my /boot/efi and /boot partitions from sda[1,2] to sdb[1,2] so that the system would boot from either drive, so I issued the following sgdisk commands:
  root# sgdisk -R /dev/sdb1 /dev/sda1
  root# sgdisk -R /dev/sdb2 /dev/sda2
  root# sgdisk -G /dev/sdb1
  root# sgdisk -G /dev/sdb2
sgdisk manipulates GPT, so you run it on the disk, not on individual partitions. What you've done simply scrambled information in sdb1 and sdb2.
The correct way to run it would be:

  # sgdisk -R /dev/sdb /dev/sda
  # sgdisk -G /dev/sdb
Point taken, I am going back to read the sgdisk documentation again. I had assumed that this would be a more technically accurate way to copy sda[1,2] to sdb[1,2] rather than using dd as a lot of how-to's suggest.
However, you would only do that if sdb were completely unpartitioned. As you had already made at least one partition on sdb a member of a RAID1 set, you should not do either of those things.
The entire premise of what you're attempting is flawed. Making a partition into a RAID member is destructive. mdadm writes its metadata inside of the member partition. The only safe way to convert a filesystem is to back up its contents, create the RAID set, format the RAID volume, and restore the backup. Especially with UEFI, there are a variety of ways that can fail. Just set up the RAID sets in the installer.
I need some additional explanation of what you are trying to say here, as I don't understand it. My objective is to have the following layout for my two 3TB disks.
  sda1   /boot/efi
  sda2   /boot
  sda3   RAID1 with sdb3

  sdb1   /boot/efi
  sdb2   /boot
  sdb3   RAID1 with sda3
I just finished re-installing using my GParted prepartitioned layout and I have a bootable system with sda1 and sda2 mounted, and md127 created from sda3 and sdb3. My array is actively resyncing, and I have successfully rebooted a couple of times without a problem. My goal now is to make sdb bootable for the case when/if sda fails. This is the process that I now believe I failed on previously, and it likely has to do with issuing the sgdisk command to a partition rather than a device. But even so, I don't understand why it would have messed with my first device, which had been bootable.
I then installed GRUB2 on /dev/sdb1 using the following command: root# grub2-install /dev/sdb1 Results: Installing for x86_64-efi platform. Installation finished. No error reported.
Again, you can do that, but it's not what you wanted to do. GRUB2 is normally installed on the drive itself, unless there's a chain loader that will load it from the partition where you've installed it. You wanted to:

  # grub2-install /dev/sdb
Yes, I am beginning to think this is correct, and as mentioned above am going back to re-read the sgdisk documentation.