• [gentoo-user] Strange UEFI boot behaviour

    From Peter Humphrey@21:1/5 to All on Sun Jul 10 16:20:01 2022
    Hello list,

    One of my machines uses bootctl to offer a choice of kernel to boot (I don't use anything else from systemd); it has these files in /boot/loader/entries:

    08-gentoo-5.15.32-r1-rescue.conf
    09-gentoo-5.15.32-r1-rescue.nonet.conf
    30-gentoo-5.18.10.conf
    32-gentoo-5.18.10.nox.conf
    34-gentoo-5.18.10.nonet.conf
    40-gentoo-5.15.41.conf
    42-gentoo-5.15.41.nox.conf
    44-gentoo-5.15.41.nonet.conf

    Until a few days ago, the system offered the kernels cited in those .conf files - in the same order as I've listed them. Also of course in ascending numerical order. Both as expected.

    Now, though, they're offered in precisely the opposite order (with the two other usual options below them as before: Windows and Enter UEFI setup).

    What might have caused this reversal?

    $ cat /boot/loader/entries/30*f
    title Gentoo 5.18.10
    version 5.18.10-gentoo
    linux vmlinuz-5.18.10-gentoo
    initrd intel-uc.img
    options root=/dev/nvme0n1p5 net.ifnames=0 raid=noautodetect pcie_aspm=off

    $ cat /boot/loader/loader.conf
    timeout 5
    default 30-gentoo-5.18.10

    $ ls /boot/vmlinuz-5.18.10-gentoo
    /boot/vmlinuz-5.18.10-gentoo

    $ efibootmgr
    BootCurrent: 0001
    Timeout: 1 seconds
    BootOrder: 0001,0007,0011,0008,0000
    Boot0000* Windows Boot Manager
    Boot0001* Gentoo Linux
    Boot0007* UEFI OS
    Boot0008* Hard Drive
    Boot0011* CD/DVD Drive

    --
    Regards,
    Peter.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael@21:1/5 to All on Sun Jul 10 17:37:00 2022
    On Sunday, 10 July 2022 15:19:08 BST Peter Humphrey wrote:
    > Now, though, they're offered in precisely the opposite order (with the two other usual options below them as before: Windows and Enter UEFI setup).
    >
    > What might have caused this reversal?

    This happens if the EFI firmware has, for some reason, re-scanned the attached block devices to find bootable UEFI images. I've seen something as simple as rebooting with, then without, a bootable USB drive cause this.
    Since the images' boot order is editable, in your case via bootctl, it should be a fixable problem.
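
    To compare the firmware's view with what the menu shows, here is a minimal sketch (not from the posts in this thread; the function name boot_order_labels is made up for illustration). It reads efibootmgr output on stdin and prints the entries in BootOrder sequence:

```shell
#!/bin/sh
# Print firmware boot entries in the sequence given by the BootOrder line.
# Reads `efibootmgr` output on stdin, so it also works on saved output.
# boot_order_labels is a hypothetical helper name, not part of any tool.
boot_order_labels() {
    awk '
        # "BootOrder: 0001,0007,..." -> remember the sequence of entry numbers
        /^BootOrder:/ { n = split($2, order, ",") }
        # "Boot0001* Gentoo Linux" -> remember the label for entry 0001
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            num = substr($1, 5, 4)
            label = $0
            sub(/^Boot[0-9A-Fa-f]+\*? +/, "", label)
            sub(/\t.*$/, "", label)   # newer efibootmgr appends a device path
            labels[num] = label
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%d. Boot%s %s\n", i, order[i], labels[order[i]]
        }
    '
}
```

    Usage would be: efibootmgr | boot_order_labels. Fed the efibootmgr output from the first post, it lists Boot0001 Gentoo Linux first and Boot0000 Windows Boot Manager last. Note that this is the firmware's own boot order, which efibootmgr -o can rewrite. The order of kernels inside the systemd-boot menu is decided separately from the files in /boot/loader/entries, and bootctl list shows that view.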
  • From Peter Humphrey@21:1/5 to All on Mon Jul 11 02:30:01 2022
    On Sunday, 10 July 2022 17:37:00 BST Michael wrote:

    > This is happening if the EFI firmware for some reason has re-scanned the attached block devices to find bootable UEFI images. I've seen something as simple as rebooting with, then without a bootable USB drive causing this. Since the images boot order is editable, in your case via bootctl, then it should be a fixable problem.

    But, as I said, the order is unchanged, yet the BIOS displays them in reverse order. I think the BIOS is not long for this world, as you will see...

    This machine shows bizarre behaviour in booting as well. Often, as soon as the POST is finished and the BIOS asks which kernel image to hand over to, I have no keyboard or mouse - except for CTRL-ALT-DEL, which does reboot.

    The thing that got me exercised today was Gentoo complaining that it couldn't mount /boot - wrong FS type or...etc. So I had to do something. So today (well, yesterday now) I told the BIOS to load its standard optimised defaults, then rebooted, then told it to load my tuned set and rebooted again. Then I booted a SystemRescueCD (because the USB version showed that same no-keyboard problem), formatted /boot with FAT32, zapped / then recovered a week-old backup. Then, still in RescCD, a sync and world-update brought the system back.

    Even then, running bootctl remove; bootctl install; replace /boot/loader/loader.conf; bootctl update - still left no UEFI boot option for the Gentoo system, though it usually does create one. I had to use efibootmgr to create a boot option, then do the bootctl dance again.
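
    For reference, a sketch of what creating such a boot option with efibootmgr might look like. This is an illustration under assumptions, not the command actually used. The ESP partition number is not given anywhere in this thread, the helper name efi_entry_cmd is made up, and the loader path is where bootctl install normally places systemd-boot on x86-64. The function only prints the command, so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Build (but do not run) an efibootmgr command that registers systemd-boot
# with the firmware. efi_entry_cmd is a hypothetical helper name.
efi_entry_cmd() {
    disk=$1    # whole disk holding the ESP, e.g. /dev/nvme0n1 (assumed)
    part=$2    # ESP partition number on that disk (assumed, not in the thread)
    label=$3   # name to show in the firmware boot menu
    # Loader path is relative to the ESP root and uses backslashes.
    printf "efibootmgr --create --disk %s --part %s --label '%s' --loader '%s'\n" \
        "$disk" "$part" "$label" '\EFI\systemd\systemd-bootx64.efi'
}
```

    For example, efi_entry_cmd /dev/nvme0n1 2 'Gentoo Linux' prints a command for partition 2 of /dev/nvme0n1. Both values are guesses, to be replaced with the real ESP location (lsblk or blkid will show it).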

    Finally, a bootable, running system.

    Oh, one other thing. This machine has a small unformatted partition before /boot, and gparted on the rescue CD showed me that it had lost its bios_grub flag. Could that account for the wrong FS type error?

    Should I consider re-flashing the BIOS? It's getting on for 10 years old. I did that to another machine once, thereby killing it stone dead.

    --
    Regards,
    Peter.

  • From Wols Lists@21:1/5 to Peter Humphrey on Mon Jul 11 09:00:02 2022
    On 11/07/2022 01:25, Peter Humphrey wrote:
    > Should I consider re-flashing the BIOS? It's getting on for 10 years old. I did that to another machine once, thereby killing it stone dead.

    I've flashed BIOSes and done stuff like that. Yes, it's scary, knowing
    you can kill the machine. No, if you're careful it's just fine.

    It's when you try and install XP, and THAT kills the BIOS, that you
    start really worrying ...

    Cheers,
    Wol

  • From Michael@21:1/5 to All on Mon Jul 11 17:19:50 2022
    On Monday, 11 July 2022 01:25:00 BST Peter Humphrey wrote:
    > On Sunday, 10 July 2022 17:37:00 BST Michael wrote:
    > > This is happening if the EFI firmware for some reason has re-scanned the attached block devices to find bootable UEFI images. I've seen something as simple as rebooting with, then without a bootable USB drive causing this. Since the images boot order is editable, in your case via bootctl, then it should be a fixable problem.
    >
    > But, as I said, the order is unchanged, yet the BIOS displays them in reverse order. I think the BIOS is not long for this world, as you will see...
    >
    > This machine shows bizarre behaviour in booting as well. Often, as soon as the POST is finished and the BIOS asks which kernel image to hand over to, I have no keyboard or mouse - except for CTRL-ALT-DEL, which does reboot.

    Check your BIOS* firmware settings for USB and enable xHCI. Perhaps this setting was toggled to auto, which may not work reliably.

    * I use the word "BIOS" to describe the UEFI firmware menu on a modern MoBo, rather than the legacy CMOS stored BIOS.


    > The thing that got me exercised today was Gentoo complaining that it couldn't mount /boot - wrong FS type or...etc. So I had to do something.

    You could have checked with fsck.fat to see what the /boot/EFI partition reported, but since you reformatted the ESP it's all new and in working order now.


    > So today (well, yesterday now) I told the BIOS to load its standard optimised defaults, then rebooted, then told it to load my tuned set and rebooted again. Then I booted a SystemRescueCD (because the USB version showed that same no-keyboard problem), formatted /boot with FAT32, zapped / then recovered a week-old backup. Then, still in RescCD, a sync and world-update brought the system back.
    >
    > Even then, running bootctl remove; bootctl install; replace /boot/loader/loader.conf; bootctl update - still left no UEFI boot option for the Gentoo system, though it usually does create one. I had to use efibootmgr to create a boot option, then do the bootctl dance again.
    >
    > Finally, a bootable, running system.
    >
    > Oh, one other thing. This machine has a small unformatted partition before /boot, and gparted on the rescue CD showed me that it had lost its bios_grub flag. Could that account for the wrong FS type error?

    Yes, it is probable you mixed up legacy BIOS (CSM) vs UEFI booting. You need to make sure that when you boot with live media you boot in UEFI mode.

    The EFI firmware can be set up to emulate a legacy BIOS configuration, by enabling its Compatibility Support Module (CSM). This setting allows legacy OSs to boot with a conventional MBR boot loader from a GPT disk. The problem which arises on a GPT formatted disk is where to store GRUB's 2nd Stage image. Normally, on a disk with an MBR partition table, the space immediately after the MBR on sector 0 contains GRUB's 2nd Stage image. On a GPT disk the sectors immediately after the (protective) MBR are used to store the GPT header and partition table, and therefore GRUB's 2nd Stage image has to be stored somewhere else - in the partition flagged bios_grub.

    An EFI MoBo which boots an OS installed in UEFI mode, on a GPT formatted disk, does not require a CSM or a bios_grub flagged partition. I assume you've installed your OSs in UEFI mode and you do not intend to run WinXP on bare metal. In this case, disable CSM.
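
    Whether a live medium (or the installed system) actually came up in UEFI mode can be checked from the running system. A minimal sketch: the kernel exposes /sys/firmware/efi only when it was started by EFI firmware. The function name boot_mode and its optional argument are made up for illustration; the argument exists only so the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Report whether the running system was booted via UEFI or legacy BIOS (CSM).
# /sys/firmware/efi exists only when the kernel was started by EFI firmware.
# boot_mode and its optional sysfs-root argument are hypothetical, for
# illustration and testing only.
boot_mode() {
    sysfs=${1:-/sys}
    if [ -d "$sysfs/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "legacy BIOS (CSM)"
    fi
}
```

    Running boot_mode with no argument from the rescue environment, before touching the ESP or running bootctl install, shows which way the medium was booted. Tools such as bootctl and efibootmgr need UEFI mode to reach the firmware's boot variables.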


    > Should I consider re-flashing the BIOS? It's getting on for 10 years old. I did that to another machine once, thereby killing it stone dead.

    As you attest, some folk have had bad experiences with flashing new firmware on their MoBos. I first check if the new firmware is meant to address any issues which affect my OS and peripherals and if it does, then I go ahead and flash it. If the release offers fixes irrelevant to my kit and OS, I leave it alone. I have not yet had a single MoBo fail on me, even after multiple flash operations. As long as the flash operation is not interrupted and the image is the correct image for the hardware, I would think the flash operation should complete successfully without having to JTAG the chipset. On a 10 year old MoBo I would consider replacing the NVRAM battery prior to (re)flashing.
  • From Peter Humphrey@21:1/5 to All on Tue Jul 12 11:10:02 2022
    On Monday, 11 July 2022 17:19:50 BST Michael wrote:

    > Check your BIOS* firmware settings for USB and enable xHCI. Perhaps this setting was toggled to auto, which may not work reliably.

    That was a good idea, Michael. Indeed, some Auto settings had crept in. I haven't yet tried booting other media though: that's for later today.

    [snip]

    > An EFI MoBo which boots an OS installed in UEFI mode, on a GPT formatted disk, does not require a CSM or a bios_grub flagged partition.

    I didn't know that, but CSM was disabled anyway.

    [snip]

    > On a 10 year old MoBo I would consider replacing the NVRAM battery prior to (re)flashing.

    That's a good idea - thanks.

    --
    Regards,
    Peter.

  • From Peter Humphrey@21:1/5 to All on Wed Jul 13 16:30:01 2022
    On Tuesday, 12 July 2022 10:02:31 BST Peter Humphrey wrote:
    > On Monday, 11 July 2022 17:19:50 BST Michael wrote:
    > > Check your BIOS* firmware settings for USB and enable xHCI. Perhaps this setting was toggled to auto, which may not work reliably.
    >
    > That was a good idea, Michael. Indeed, some Auto settings had crept in. I haven't yet tried booting other media though: that's for later today.

    ...and I can now boot a USB SysRescCD. Thanks again.


    --
    Regards,
    Peter.
