Stale root UUID karg sometimes gets injected #2033

@Goz3rr

Describe the bug

Hey all,

As the title says, we've run into a very weird bug: the exact same installation media works fine on a system with a SATA SSD, but fails after the first boot on a system that is identical except for having an NVMe SSD.

The main reason seems to be that coreos-boot-edit injects the wrong root UUID into the kernel arguments. On NVMe SSDs with LUKS enabled, the UUID of the decrypted partition (dm-0) changes after it is grown during the first boot. On an NVMe SSD with LUKS disabled, or on a SATA SSD regardless of LUKS state, this UUID does not change.

The UUID change does not seem to affect anything on the "base" fedora-coreos-42.20250901.3.0-live-iso.x86_64.iso image. However, our use case requires an offline installation because our devices are usually not connected to the internet, so we customize the base ISO to preinstall a few packages (see reproduction steps). Once a certain number or combination of packages has been added, the issue seems to occur.

Here are a few logs and lsblk -f outputs from various installs for comparison:

Installing the "broken" image on an NVMe disk

nvme-boot-broken.log

lsblk output:

$ lsblk -f
NAME        FSTYPE      FSVER LABEL      UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
nvme0n1
├─nvme0n1p1
├─nvme0n1p2 vfat        FAT16 EFI-SYSTEM 7B77-95E7
├─nvme0n1p3 ext4        1.0   boot       585cb74e-6e78-43a7-8df7-8de77e6a84d4  102.9M    64% /boot
└─nvme0n1p4 crypto_LUKS 2     luks-root  c3da1c0a-34bc-4f75-8de2-dac91ff126cd
  └─root    xfs               root       988645bb-c917-4d20-a71d-2e7f0df33413  464.3G     2% /var
                                                                                             /sysroot/ostree/deploy/fedora-coreos/var
                                                                                             /etc
                                                                                             /sysroot
Installing the exact same image as above on a SATA disk, where the UUID does not change

sata-boot-notbroken.log

lsblk output:

# lsblk -f
NAME     FSTYPE      FSVER LABEL      UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda
├─sda1
├─sda2   vfat        FAT16 EFI-SYSTEM 7B77-95E7
├─sda3   ext4        1.0   boot       fe5d02e4-c11e-4423-8b29-ad8ba7b06314  102.9M    64% /boot
└─sda4   crypto_LUKS 2     luks-root  c5dea5fc-4ad3-455f-bfc4-60be6331e98c
  └─root xfs               root       22db5ae9-37fc-4836-a795-9d58ba6d7592  113.5G     4% /var
                                                                                          /sysroot/ostree/deploy/fedora-coreos/var
                                                                                          /etc
                                                                                          /sysroot
Installing the same image on an NVMe drive, but with the ntfs-3g, weston, weston-demo and chromium packages removed, where the system continues to work after the first boot

nvme-boot-notbroken.log

lsblk output:

# lsblk -f
NAME        FSTYPE      FSVER LABEL      UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
nvme0n1
├─nvme0n1p1
├─nvme0n1p2 vfat        FAT16 EFI-SYSTEM 7B77-95E7
├─nvme0n1p3 ext4        1.0   boot       dd7a33a9-0bb4-4c4f-8e9d-608c7c62ef0c  104.1M    64% /boot
└─nvme0n1p4 crypto_LUKS 2     luks-root  336cabb0-35d7-4e9e-b11b-8de44ab36a8b
  └─root    xfs               root       7399c039-3242-4187-b43c-d6fe381c4b90  465.2G     2% /var
                                                                                             /sysroot/ostree/deploy/fedora-coreos/var
                                                                                             /etc
                                                                                             /sysroot

What stands out to me, after several builds of trial and error and comparing logs, is that the UUID changes after Ignition OSTree: Grow Root Filesystem has run only on NVMe drives with LUKS enabled. When LUKS is disabled or when using a SATA drive, the UUID does not change after this step.

Excerpts from the first log:
ignition-ostree-growfs.service runs and resizes dm-0 (dff64afa-e35d-4ce2-9887-c3c050306ddd):

[   36.300870] systemd[1]: Starting ignition-ostree-growfs.service - Ignition OSTree: Grow Root Filesystem...
[   36.353463] XFS (dm-0): Mounting V5 Filesystem dff64afa-e35d-4ce2-9887-c3c050306ddd
[   36.361335] XFS (dm-0): Ending clean mount
[   37.513113] ignition-ostree-growfs[4037]: CHANGED: partition=4 start=1050624 old: size=7815168 end=8865791 new: size=999164559 end=1000215182
...
[   38.827230] XFS (dm-0): Unmounting Filesystem dff64afa-e35d-4ce2-9887-c3c050306ddd
[   38.837478] systemd[1]: Finished ignition-ostree-growfs.service - Ignition OSTree: Grow Root Filesystem.

ignition-ostree-transposefs-restore.service runs and reveals dm-0 has received a new UUID (988645bb-c917-4d20-a71d-2e7f0df33413):

[   39.144992] systemd[1]: Starting ignition-ostree-transposefs-restore.service - Ignition OSTree: Restore Partitions...
[   39.173496] ignition-ostree-transposefs[4353]: Restoring rootfs from RAM...
[   39.185350] ignition-ostree-transposefs[4353]: Mounting /dev/disk/by-label/root rw (/dev/dm-0) to /sysroot
[   39.199438] XFS (dm-0): Mounting V5 Filesystem 988645bb-c917-4d20-a71d-2e7f0df33413
[   39.212860] XFS (dm-0): Ending clean mount
[   42.137903] ignition-ostree-transposefs[4386]: changing security context of '/sysroot'
[   43.905756] XFS (dm-0): Unmounting Filesystem 988645bb-c917-4d20-a71d-2e7f0df33413
[   43.976155] systemd[1]: Finished ignition-ostree-transposefs-restore.service - Ignition OSTree: Restore Partitions.

coreos-boot-edit.service uses the UUID from before the resize (dff64afa-e35d-4ce2-9887-c3c050306ddd), resulting in a broken state that will not boot anymore:

[   44.921380] systemd[1]: Starting coreos-boot-edit.service - CoreOS Boot Edit...
...
[   45.024760] coreos-boot-edit[4572]: Injected kernel arguments into BLS: rd.luks.name=c3da1c0a-34bc-4f75-8de2-dac91ff126cd=root root=UUID=dff64afa-e35d-4ce2-9887-c3c050306ddd rw rootflags=prjquota
[   45.025689] coreos-boot-edit[4563]: Prepared rootmap
[   45.086114] coreos-boot-edit[4589]: Relabeled /sysroot//boot/.root_uuid from <no context> to system_u:object_r:boot_t:s0
[   45.094043] coreos-boot-edit[4591]: Relabeled /sysroot//boot/grub2/bootuuid.cfg from <no context> to system_u:object_r:boot_t:s0
[   45.095136] systemd[1]: Finished coreos-boot-edit.service - CoreOS Boot Edit.
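
The mismatch in these excerpts can also be spotted mechanically. The following is a minimal sketch, not part of any CoreOS tooling: it assumes the exact log line formats shown above, and the helper name `find_stale_root_uuid` is hypothetical. It compares the UUID from the most recent XFS mount on dm-0 with the `root=UUID=` value that coreos-boot-edit reports having injected.

```python
import re

UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
# Kernel message logged each time the XFS filesystem on dm-0 is mounted.
MOUNT_RE = re.compile(rf"XFS \(dm-0\): Mounting V5 Filesystem ({UUID})")
# coreos-boot-edit message listing the kernel arguments it wrote.
KARG_RE = re.compile(rf"Injected kernel arguments into BLS:.*\broot=UUID=({UUID})")


def find_stale_root_uuid(log_text):
    """Return (injected, last_mounted) if the two UUIDs differ, else None.

    Hypothetical helper; assumes the log formats from the excerpts above.
    """
    last_mounted = None
    for line in log_text.splitlines():
        mount = MOUNT_RE.search(line)
        if mount:
            # Remember the most recent filesystem UUID seen on dm-0.
            last_mounted = mount.group(1)
            continue
        karg = KARG_RE.search(line)
        if karg and last_mounted is not None:
            injected = karg.group(1)
            if injected != last_mounted:
                return (injected, last_mounted)
            return None
    return None
```

Run against the contents of nvme-boot-broken.log, this should report dff64afa-e35d-4ce2-9887-c3c050306ddd as injected and 988645bb-c917-4d20-a71d-2e7f0df33413 as the last mounted UUID, provided the full log uses the same line formats as the excerpts.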

Reproduction steps

  1. Build customized base image
  • cosa init --branch stable https://github.com/coreos/fedora-coreos-config
  • Modify src/config/manifests/fedora-coreos.yaml to remove the excludes for plymouth and python3, and add a new include for a custom manifest
  • Create custom manifest:
repos:
  - fedora
  - fedora-updates
  - fedora-coreos-pool

packages:
  - plymouth
  - plymouth-plugin-script
  - plymouth-graphics-libs
  - plymouth-plugin-label
  - lm_sensors
  - weston
  - weston-demo
  - chromium
  - podman-compose
  - ntfs-3g
  - google-noto-emoji-fonts
  - dejavu-fonts-all
  • cosa fetch --with-cosa-overrides
  • cosa build
  • cosa osbuild live
  2. Create a minimal butane file (ours just has a user for debugging) that, most importantly, contains:
boot_device:
  luks:
    tpm2: true
  3. Compile the butane file and build the image
podman run --interactive --rm --security-opt label=disable \
       --volume ${PWD}:/pwd --workdir /pwd quay.io/coreos/butane:release \
       --pretty --files-dir local-files --strict install.bu > config.ign

podman run --pull=always --privileged --rm \
    -v /dev:/dev -v /run/udev:/run/udev -v .:/data -w /data \
    quay.io/coreos/coreos-installer:release \
    iso customize \
        --dest-ignition config.ign \
        --dest-device /dev/nvme0n1 \
        --dest-console tty0 \
        --dest-karg-append quiet \
        --dest-karg-append rhgb \
        --live-karg-append quiet \
        --live-karg-append rhgb \
        -o test.iso fedora-coreos-42.20250917.dev.1-live-iso.x86_64.iso
  4. Write test.iso to a flash drive and install it on the device
  5. Reboot the device after the first boot completes

Expected behavior

coreos-boot-edit should inject the correct UUID, and the system should reboot successfully after the first boot

Actual behavior

coreos-boot-edit injects the wrong UUID (seemingly the UUID from before growing the root filesystem?), and the system fails to boot again after the first boot
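
To check for this state before rebooting, the injected kernel arguments can be compared with the UUID the root filesystem actually has. The sketch below is only an illustration under stated assumptions: the helper names `parse_root_kargs` and `root_karg_is_stale` are hypothetical, and the caller is assumed to supply both the kernel argument string and the current filesystem UUID.

```python
def parse_root_kargs(options):
    """Extract the root= UUID and rd.luks.name mappings from a karg string.

    Hypothetical helper; only handles the argument forms seen in the log above.
    """
    parsed = {"root_uuid": None, "luks": {}}
    for arg in options.split():
        if arg.startswith("root=UUID="):
            parsed["root_uuid"] = arg[len("root=UUID="):]
        elif arg.startswith("rd.luks.name="):
            # Form: rd.luks.name=<LUKS UUID>=<mapped device name>
            luks_uuid, _, name = arg[len("rd.luks.name="):].partition("=")
            parsed["luks"][luks_uuid] = name
    return parsed


def root_karg_is_stale(options, actual_fs_uuid):
    """True if root=UUID= is present and differs from the real filesystem UUID."""
    root_uuid = parse_root_kargs(options)["root_uuid"]
    if root_uuid is None:
        return False
    return root_uuid.lower() != actual_fs_uuid.lower()
```

On an installed system, the argument string would presumably come from the `options` line of the BLS entry on the boot partition, and the actual UUID from something like `lsblk -no UUID /dev/mapper/root`. Both sources are assumptions here, not something the logs above confirm.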

System details

  • Bare metal
  • Fedora CoreOS 42 stable

Butane or Ignition config

variant: fcos
version: 1.6.0
boot_device:
  luks:
    tpm2: true

kernel_arguments:
  should_not_exist:
    - console=ttyS0,115200
  should_exist:
    - quiet
    - rhgb

passwd:
  users:
    - name: testuser
      ssh_authorized_keys:
        - [...]
      home_dir: /home/testuser
      no_create_home: false
      password_hash: [...]
      groups:
        - wheel
      shell: /bin/bash

Additional information

No response
