
Boot device detection is not deterministic #1919

Open
3 of 11 tasks
nestire opened this issue Mar 3, 2025 · 5 comments

Comments

@nestire
Contributor

nestire commented Mar 3, 2025

Please identify some basic details to help process the report

A. Provide Hardware Details

  1. What board are you using? (Choose from the list of boards here)
    novacustom-v560tu

B. Identify how the board was flashed

  1. How was Heads initially flashed?
    • External flashing
    • Internal-only / 1vyprep+1vyrain / skulls
    • Don't know

C. Identify the rom related to this bug report

  1. Did you download or build the rom at issue in this bug report?

    • [x] I downloaded it
    • I built it
  2. If you downloaded your rom, where did you get it from?

    • Heads CircleCi
    • Purism
    • Nitrokey
    • Dasharo DTS (Novacustom)
    • Somewhere else (please identify)

    Please provide the release number or otherwise identify the rom downloaded

  3. If you built your rom, which repository:branch did you use?

    • Heads:Master
    • Other (please identify)

Please describe the problem

Describe the bug
If you have two NVMe drives installed in the v56 laptop, of the same size and vendor, and both are valid boot devices, Heads will switch between these boot devices on every reboot in a random way. This creates a lot of different faulty behaviours which are hard to diagnose.

Expected behavior
Always choose the same boot device; warn during OEM Factory Reset that there are two valid boot devices and ask which one is the correct one.

Additional context
My guess is that this fdisk call is to blame:

fdisk -l 2>/dev/null | grep "Disk /dev/" | cut -f2 -d " " | cut -f1 -d ":" >/tmp/disklist

I am not sure whether this is also a problem for non-NVMe setups.

@tlaurion
Collaborator

tlaurion commented Mar 3, 2025

@nestire I guess the fix would be to mount /boot by partition UUID rather than by the convenient /dev/ naming scheme, the UUID being the deterministic one?

@nestire
Contributor Author

nestire commented Mar 5, 2025

@tlaurion The naming of the device is not changing, yet at one boot it is /dev/nvme1 and on the next boot it is /dev/nvme0. The problem here seems to be that the order in which fdisk lists the devices is not fixed, so whichever device is at the top will be chosen, since the function stops at the first valid boot device. A simple "sort" in the pipe should probably fix this.
But using the UUID is in general good practice, I guess.
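The suggested sort can be sketched as follows. This is a minimal illustration, not Heads code: the sample fdisk output below is made up, with the disks deliberately listed in a swapped order. Appending `sort` to the existing pipeline makes the first entry of /tmp/disklist stable regardless of the order fdisk happens to emit.

```shell
# Hypothetical sample of `fdisk -l` output, with the disks deliberately
# listed in the "wrong" order (sizes and names are made up).
sample='Disk /dev/nvme1n1: 931.5 GiB
Disk /dev/nvme0n1: 931.5 GiB'

# The pipeline from detect_boot_device, with `sort` appended so the
# resulting list order no longer depends on fdisk's listing order.
printf '%s\n' "$sample" | grep "Disk /dev/" | cut -f2 -d " " | cut -f1 -d ":" | sort
```

With the sort in place, /dev/nvme0n1 always comes first, so the "first valid boot device wins" loop picks the same disk on every boot, assuming the kernel's name assignment itself is stable, which is exactly what is in question in this issue.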

@tlaurion
Collaborator

tlaurion commented Mar 5, 2025

@nestire

@tlaurion the naming of the device is not changing, so at one boot it is /dev/nvme1 and on the next boot it is /dev/nvme0.

This doesn't make sense to me. fdisk -l just lists disks; the convenient /dev/* naming scheme is populated by the kernel in first-seen, first-assigned order.

The problem here seems to be that the order in which fdisk lists the devices is not fixed, so whichever device is at the top will be chosen, since the function stops at the first valid boot device. A simple "sort" in the pipe should probably fix this.

fdisk -l already sorts convenient names alphanumerically, so the kernel's first-discovered disk will be /dev/nvme0 and the second will be /dev/nvme1. The problem here is that across reboots those convenient names change: nvme0 <-> nvme1 are assigned to the drives in whatever order the kernel happens to find them.

But using the UUID is in general good practice, I guess.

GRUB works like that nowadays: the device passed is identified per UUID, and the same goes for fstab. blkid does the mapping between convenient device name and UUID. I'm not sure how to properly refactor the codebase to use UUIDs instead of convenient device names, though.
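As a rough sketch of the mapping blkid provides (the device name and UUID below are made up, and the parsing is illustrative, not code from the Heads tree):

```shell
# Hypothetical `blkid` output line for a boot partition (values made up).
line='/dev/nvme0n1p1: UUID="3f1d2c4a-9d6e-4b2a-8c11-0a5e7d9b3f21" TYPE="ext4"'

# Split it into the convenient device name and the stable filesystem UUID.
dev=${line%%:*}
uuid=$(printf '%s\n' "$line" | sed -n 's/.*UUID="\([^"]*\)".*/\1/p')

echo "$dev -> $uuid"
```

The device name on the left of the arrow can change across reboots; the UUID on the right is a property of the filesystem and stays put.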

heads/initrd/etc/functions

Lines 1129 to 1170 in d4c4e56

detect_boot_device() {
    TRACE_FUNC
    local devname
    # unmount /boot to be safe
    cd / && umount /boot 2>/dev/null
    # check $CONFIG_BOOT_DEV if set/valid
    if [ -e "$CONFIG_BOOT_DEV" ] && mount_possible_boot_device "$CONFIG_BOOT_DEV"; then
        # CONFIG_BOOT_DEV is valid device and contains an installed OS
        return 0
    fi
    # generate list of possible boot devices
    fdisk -l 2>/dev/null | grep "Disk /dev/" | cut -f2 -d " " | cut -f1 -d ":" >/tmp/disklist
    # Check each possible boot device
    for i in $(cat /tmp/disklist); do
        # If the device has partitions, check the partitions instead
        if device_has_partitions "$i"; then
            devname="$(basename "$i")"
            partitions=("/sys/class/block/$devname/$devname"?*)
        else
            partitions=("$i") # Use the device itself
        fi
        for partition in "${partitions[@]}"; do
            partition_dev=/dev/"$(basename "$partition")"
            # No sense trying something we already tried above
            if [ "$partition_dev" = "$CONFIG_BOOT_DEV" ]; then
                continue
            fi
            # If this is a reasonable boot device, select it and finish
            if mount_possible_boot_device "$partition_dev"; then
                CONFIG_BOOT_DEV="$partition_dev"
                return 0
            fi
        done
    done
    # no valid boot device found
    echo "Unable to locate /boot files on any mounted disk"
    return 1
}

Somehow, CONFIG_BOOT_DEV is not enough, since it refers to something non-deterministic when multiple NVMe drives are present, each with a distinct boot device. It seems we would need to change the content of CONFIG_BOOT_DEV to its UUID and make sure this doesn't introduce regressions in all the places it is used.
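One possible shape for that refactor, sketched very roughly: the function name `uuid_to_dev` and all values below are hypothetical. In Heads this would scan the real `blkid` output; here the listing is passed on stdin so the logic can be shown standalone.

```shell
# Resolve a stored boot-device UUID back to whatever convenient name the
# kernel gave the drive on this particular boot.
uuid_to_dev() {
    # $1 = UUID to look up; stdin = blkid-style listing
    sed -n "s/^\(\/dev\/[^:]*\): .*UUID=\"$1\".*/\1/p"
}

# Two identical drives whose nvme0/nvme1 names may swap across reboots,
# but whose filesystem UUIDs stay put (values made up).
listing='/dev/nvme0n1p1: UUID="aaaa-1111" TYPE="ext4"
/dev/nvme1n1p1: UUID="bbbb-2222" TYPE="ext4"'

printf '%s\n' "$listing" | uuid_to_dev "bbbb-2222"
```

CONFIG_BOOT_DEV could then store the UUID, and every mount site would resolve it at use time, so an nvme0/nvme1 swap would no longer matter.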

#903 proposed something similar, using labels instead of UUIDs, where the associated commit afba8f7 suggests UUIDs.

Thoughts @nestire ?

@nestire
Contributor Author

nestire commented Mar 10, 2025

Hi,

I think there is a misunderstanding. What happened was that the boot device changed from /dev/nvme0 to /dev/nvme1 between boots, while the contents of /dev/nvme0 and /dev/nvme1 stayed the same, so the kernel always assigned the same hardware to these names.
Because of that, my guess was this 'fdisk -l' call. My understanding is also that its output should be sorted, but somehow it did not always produce the expected result "/dev/nvme0"; 50% of the time it was "/dev/nvme1". In the recovery shell both devices were always present.
Another explanation could be that /dev/nvme0 was not mountable/readable during the Heads boot due to a race condition, in which case the mount of /dev/nvme1 succeeded and that device was chosen instead of /dev/nvme0.

Unfortunately, the device where we saw this is gone now, but I will try to reproduce this on another device to be 100% sure.

@tlaurion
Collaborator

tlaurion commented Mar 10, 2025

Hi,

I think there is a misunderstanding. What happened was that the boot device changed from /dev/nvme0 to /dev/nvme1 between boots, while the contents of /dev/nvme0 and /dev/nvme1 stayed the same, so the kernel always assigned the same hardware to these names. Because of that, my guess was this 'fdisk -l' call. My understanding is also that its output should be sorted, but somehow it did not always produce the expected result "/dev/nvme0"; 50% of the time it was "/dev/nvme1". In the recovery shell both devices were always present. Another explanation could be that /dev/nvme0 was not mountable/readable during the Heads boot due to a race condition, in which case the mount of /dev/nvme1 succeeded and that device was chosen instead of /dev/nvme0.

Unfortunately, the device where we saw this is gone now, but I will try to reproduce this on another device to be 100% sure.

@nestire This will need replication. busybox fdisk will report /dev/nvme0 before /dev/nvme1; its output is ordered alphanumerically.

If the order is unstable between reboots, we would need to move away from friendly device names in the codebase and replace them with UUIDs.
This minimally needs to be replicated and properly diagnosed first, before I try to replicate it under QEMU and start working on a fix. If two NVMe drives are provisioned by the OEM, or an end user installs another OS on a second NVMe, then, following your description, there is no ordering guarantee (as opposed to /dev/sd*), which would be a new problem requiring a fix.
