Hi All,
I have a Dell R710 that has 6x1TB in a RAID-5 configuration. When installing CentOS 7 using the full disk capacity and booting in UEFI mode the machine dumps me into a GRUB rescue mode prompt.
    error: disk `,gpt2' not found
    Entering rescue mode...
    grub rescue>
If I use the PERC RAID controller to carve out a smaller ROOTDISK volume of 100GB and a DATA volume from the rest of the disk, the system boots just fine, so this seems like a GRUB2 issue. Any thoughts?
On Thu, Aug 18, 2016 at 11:57 AM, James A. Peltier jpeltier@sfu.ca wrote:
> Hi All,
> I have a Dell R710 that has 6x1TB in a RAID-5 configuration.
This is hardware RAID 5? Because it's pretty screwy how this ends up working when using software RAID and might take additional troubleshooting.
> When installing CentOS 7 using the full disk capacity and booting in UEFI mode the machine dumps me into a GRUB rescue mode prompt.
> error: disk `,gpt2' not found
> Entering rescue mode...
> grub rescue>
This is confusing to me because there should be no such thing as grub rescue on UEFI. On BIOS systems, there is boot.img (formerly stage 1) and core.img in the MBR gap or on BIOS Boot if GPT disk (formerly stage 1.5 and stage 2). The core.img is where grub rescue comes from when it can't find grub modules, in particular normal.mod.
But on UEFI, core.img, normal.mod, and a pile of other modules are all baked into the grubx64.efi file found on the EFI system partition.
I suspect two things that can cause normal.mod to not be found:
a. The system is not in fact booting in UEFI mode and there's been some mistake in the installation of grub.
b. The system is in UEFI mode, but grub2-install was run, either by the installer or post-install, which overwrites the grubx64.efi installed by the grub2-efi package. It's not really proper to run grub2-install on UEFI systems.
Boot off install media with boot parameter inst.rescue and choose all the default options; this ought to assemble the file system per fstab, and you can

    chroot /mnt/sysimage
    yum reinstall grub2-efi
    efibootmgr -v
    grep efibootmgr /var/log/anaconda/program.log   ## I think that's right; it might be anaconda.program.log though
It's really just reinstalling grub2-efi that should fix the problem; the two commands after it are just information gathering in case the reboot still doesn't work.
----- Original Message -----
| On Thu, Aug 18, 2016 at 11:57 AM, James A. Peltier jpeltier@sfu.ca wrote:
| > Hi All,
| >
| > I have a Dell R710 that has 6x1TB in a RAID-5 configuration.
|
| This is hardware RAID 5? Because it's pretty screwy how this ends up
| working when using software RAID and might take additional
| troubleshooting.
Yes, it's a Dell R710XD
| > When installing CentOS 7 using the full disk capacity and booting in UEFI
| > mode the machine dumps me into a GRUB rescue mode prompt.
| > error: disk `,gpt2' not found
| > Entering rescue mode...
| > grub rescue>
|
| This is confusing to me because there should be no such thing as grub
| rescue on UEFI. On BIOS systems, there is boot.img (formerly stage 1)
| and core.img in the MBR gap or on BIOS Boot if GPT disk (formerly
| stage 1.5 and stage 2). The core.img is where grub rescue comes from
| when it can't find grub modules, in particular normal.mod.
|
| But on UEFI, core.img, normal.mod, and a pile of other modules are all
| baked into the grubx64.efi file found on the EFI system partition.
|
| I suspect two things that can cause normal.mod to not be found:
| a. The system is not in fact booting in UEFI mode and there's been
| some mistake in the installation of grub.
| b. The system is in UEFI mode, but either the installer, or
| post-install, grub2-install was run which obliterates the grub2-efi
| package installed grubx64.efi, i.e. it's not really proper to run
| grub2-install on UEFI systems.
I suspect this is the case. When attempting to run grub-install, the system claims that the grub2-efi-modules package isn't installed, so this may be an installer bug.
| Boot off install media with boot parameter inst.rescue and choose all
| the default options; this ought to assemble the file system per fstab,
| and you can
|
| chroot /mnt/sysimage
| yum reinstall grub2-efi
| efibootmgr -v
| grep efibootmgr /var/log/anaconda/program.log ## I think that's
| right it might be anaconda.program.log though
|
| It's really just reinstalling grub2-efi that should fix the problem,
| the following two options are just information gathering in case the
| reboot still doesn't work.
We'll try this and get back to you soon.
On Fri, Aug 19, 2016 at 4:59 PM, James A. Peltier jpeltier@sfu.ca wrote:
> I suspect this is the case. When attempting to run grub-install, the system claims that the grub2-efi-modules package isn't installed, so this may be an installer bug.
What is attempting to run grub-install? Or even grub2-install? If the installer is doing this, it's an installer bug. If the user is doing it, it's user error.
Also, you will need to check the NVRAM for stale values, because grub2-install also populates NVRAM with what will become the wrong entry. Use 'efibootmgr -v' to get a listing and find the bogus entry, which will point to a path that includes grubx64.efi. Note its boot number, then run 'efibootmgr -b <bootnum> -B', where <bootnum> is the four-digit hexadecimal value for the bogus entry.
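As a rough sketch of the listing step, assuming the usual `efibootmgr -v` line layout (`BootNNNN* label ... File(path)`), the helper below prints the boot numbers of entries whose path names grubx64.efi. The name `find_direct_grub_entries` is hypothetical and is not part of efibootmgr; it only filters text on stdin and deletes nothing.

```shell
#!/bin/sh
# Sketch only: find_direct_grub_entries is a hypothetical helper name.
# Reads `efibootmgr -v` style output on stdin and prints the four-digit
# boot number of every entry whose line mentions grubx64.efi.
find_direct_grub_entries() {
    # Keep only lines naming grubx64.efi (case-insensitive), then
    # extract the hex digits that follow the leading "Boot".
    grep -i 'grubx64\.efi' | sed -n 's/^Boot\([0-9A-Fa-f]\{4\}\).*/\1/p'
}
```

On the affected system this would be fed with `efibootmgr -v | find_direct_grub_entries`; each number printed is a candidate for the `efibootmgr -b <bootnum> -B` step described above, after checking the entry by eye.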
What should happen if there are no valid entries is that shim.efi will work with fallback.efi to create a proper NVRAM entry. The proper entry can be found with the earlier grep efibootmgr command, and you can just use that, while adding an additional \ for each \ so that it's \\. NVRAM should point to shim.efi, and it's shim.efi that loads the prebaked grubx64.efi.
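The backslash doubling can be sketched with sed. The helper name `double_backslashes` is hypothetical, and the path in the usage note is only an assumed example of what the logged command might contain, not something taken from this system's log.

```shell
#!/bin/sh
# Sketch only: double_backslashes is a hypothetical helper name.
# Reads text on stdin and writes it back with every backslash doubled,
# so the result survives being pasted unquoted into a shell.
double_backslashes() {
    sed 's/\\/\\\\/g'
}
```

For example, `printf '%s\n' '\EFI\centos\shim.efi' | double_backslashes` turns each single backslash into a pair. Wrapping the loader path in single quotes when retyping the command is an alternative that avoids the doubling.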
----- Original Message -----
| On Fri, Aug 19, 2016 at 4:59 PM, James A. Peltier jpeltier@sfu.ca wrote:
| > I suspect this is the case. when attempting to run grub-install the system
| > claims that the grub2-efi-modules packages aren't installed, so this may
| > be an installer bug.
|
| What is attempting to run grub-install? Or even grub2-install? If the
| installer is doing this, it's an installer bug. If the user is doing
| it, it's user error.
|
| Also, you will need to check the NVRAM for stale values because
| grub2-install also populates NVRAM with what will become the wrong
| entry. You'll need to use 'efibootmgr -v' to get a listing to find the
| bogus entry, which will be pointing to a path that includes
| grubx64.efi, note the boot number and then do 'efibootmgr -b <bootnum>
| -B' Where bootnum is the four digit value for the bogus entry.
|
| What should happen if there are no valid entries is shim.efi will work
| with fallback.efi to create a proper NVRAM entry. The proper entry can
| be found with the earlier grep efibootmgr command, and you can just
| use that, while adding an additional \ for each \, so that it's \\.
| NVRAM should point to shim.efi and it's shim.efi that loads the
| prebaked grubx64.efi.
When running grub2-install from within recovery mode, I can assure you it is not a user error, because simply installing the grub2-efi-modules package allows grub2-install to work. We will try efibootmgr -v too.
On Mon, Aug 22, 2016 at 1:24 PM, James A. Peltier jpeltier@sfu.ca wrote:
> When running grub2-install from within recovery mode I can assure you it is not a user error because simply installing the grub2-efi-modules package allows for grub2-install to work.
No, this logic is flawed. Running grub2-install is obsolete on UEFI; it only applies to users who know exactly what they're getting themselves into and have a use case for modules in grub2-efi-modules that are not already in the grubx64.efi binary that's included in the grub2-efi package. If you run grub2-install, it blows away the grubx64.efi from the grub2-efi package in favor of a custom-built one, which has completely different and for the most part undocumented behavior.
For example, the grubx64.efi bootloader in grub2-efi expects to find grub.cfg on the ESP in the same directory as the grubx64.efi binary. If you run grub2-install, the resulting grubx64.efi expects to find grub.cfg in /boot/grub2/, which is on your boot volume, not the EFI System Partition. If this is a UEFI system with Secure Boot enabled, the grubx64.efi created by grub2-install is not signed, so it'll fail Secure Boot unless you go down the rabbit hole of signing it yourself. Whereas the CentOS-supplied grubx64.efi in the grub2-efi package is already signed. And so on.
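One way to see which layout an installed system has is to check where grub.cfg actually exists. This is a sketch under assumptions: `report_grub_cfg` is a hypothetical helper name, and it assumes the ESP is mounted at /boot/efi with the CentOS files under EFI/centos, which is the usual default but is not confirmed for this machine.

```shell
#!/bin/sh
# Sketch only: report_grub_cfg is a hypothetical helper name.
# $1 is the root of the installed system, e.g. /mnt/sysimage from
# rescue mode or / on a running system.
report_grub_cfg() {
    root=${1%/}
    # Assumed default ESP location; read by the packaged grubx64.efi.
    esp_cfg="$root/boot/efi/EFI/centos/grub.cfg"
    # Location read by a grubx64.efi built by grub2-install.
    boot_cfg="$root/boot/grub2/grub.cfg"
    if [ -f "$esp_cfg" ]; then echo "esp: $esp_cfg"; fi
    if [ -f "$boot_cfg" ]; then echo "boot: $boot_cfg"; fi
    return 0
}
```

Running `report_grub_cfg /mnt/sysimage` from the rescue shell prints one line per grub.cfg it finds. A system with only the "boot:" line has no config where the packaged grubx64.efi would look for it.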
How are you booting the CentOS installation media? How was that media created? This matters because it's possible to end up with a CSM-BIOS boot inadvertently, and the installer will then install a grub for BIOS firmware, and not the entirely separate bootloader for UEFI. So it might be worth booting from that install media, getting to a shell, and checking whether this is in fact a UEFI mode boot by running efibootmgr. If you get an error message, it's not a UEFI mode boot, it's using CSM-BIOS mode, and that would explain why the wrong bootloader is being installed by the installer.
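A different check from running efibootmgr, and one that is easy to script in a Kickstart %pre section, is to test for /sys/firmware/efi, which the Linux kernel only exposes when it was booted through EFI. The function name `booted_in_uefi_mode` and its optional argument are hypothetical; the argument exists only so the check can be pointed at a test directory.

```shell
#!/bin/sh
# Sketch only: booted_in_uefi_mode is a hypothetical helper name.
# Succeeds when the kernel exposes the EFI directory under sysfs.
# The optional $1 overrides the sysfs mount point (default /sys).
booted_in_uefi_mode() {
    sysfs=${1:-/sys}
    [ -d "$sysfs/firmware/efi" ]
}

if booted_in_uefi_mode; then
    echo "UEFI mode boot"
else
    echo "BIOS/CSM mode boot"
fi
```

If this reports a BIOS/CSM boot from the install environment, that would be consistent with the installer laying down the BIOS flavor of grub.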
----- Original Message -----
| On Mon, Aug 22, 2016 at 1:24 PM, James A. Peltier jpeltier@sfu.ca wrote:
| > When running grub2-install from within recovery mode I can assure you it is
| > not a user error because simply installing the grub2-efi-modules package
| > allows for grub2-install to work.
|
| No, this logic is flawed. Running grub2-install is obsolete on UEFI,
| it only applies for users who know exactly what they're getting
| themselves into and have a use case for modules in grub2-efi-modules
| that are not already in the grubx64.efi binary that's included in the
| grub2-efi package. If you run grub2-install, it blows away that
| grubx64.efi from the grub2-efi package in favor of a custom built one,
| which has completely different and for the most part undocumented
| behavior.
Perhaps this should be documented then. I had assumed (yes, I know) that this worked the same on an EFI-based system, since apparently it isn't documented.
| For example the grubx64.efi bootloader in grub2-efi expects to find
| grub.cfg on the ESP in the same directory as the grubx64.efi binary.
| If you run grub2-install, the resulting grubx64.efi expects to find
| grub.cfg in /boot/grub2/ which is on your boot volume, not the EFI
| System Partition. If this is UEFI system with Secure Boot enabled, the
| grub2-install created grubx64.efi is not signed, so it'll fail Secure
| Boot unless you go down the rabbit hole of signing it yourself.
| Whereas the CentOS supplied grubx64.efi in the grub2-efi package is
| already signed. And so on.
|
| How are you booting the CentOS installation media? How was that media
| created? This matters because it's possible to end up with a CSM-BIOS
| boot inadvertently, and the installer will install a grub for BIOS
| firmware, and not the entirely separate bootloader for UEFI. So it
| might be worth booting from that install media, and get to a shell and
| check if in fact this is an UEFI mode boot by running efibootmgr. If
| you get an error message, it's not a UEFI mode boot, it's using
| CSM-BIOS mode, and that would explain why the wrong bootloader is
| being installed by the installer.
We're doing a pretty standard iPXE/Kickstart installation that has worked on other UEFI-based systems (R710/720/730, etc.). This is the first system that I've run into this with. We will try booting the installation media, because we're also including the updates repository as part of the installation and perhaps this is causing the issue.