⚠️ UmbrelOS Update Fails on Large‑Disk Nodes — Emergency‑Mode Boot Loop After Upgrade

:puzzle_piece: The Problem

On UmbrelOS x86 installations using large hard drives (e.g. 8TB), updating from version 1.3 to 1.4 (or likely any future version) causes the system to fail during boot. Specifically, the boot process halts in emergency mode because:

data.mount fails due to timeout.

:warning: Why This Happens

1. UmbrelOS uses a dual-partition A/B upgrade model

  • UmbrelOS applies updates to the inactive root partition (e.g. /dev/sda3) while running from the active one (/dev/sda2).
  • After updating, the system reboots into the new partition.

2. Custom configurations (like /etc/fstab) are not preserved

  • Any manual edits (such as extending mount timeouts or tuning mount options) are not copied to the new partition.
  • So the updated system boots with default systemd settings, including short timeouts for mounting.

3. Large disks take longer to mount

  • Especially with ext4 and journaling, mounting /data on a large disk (e.g., 8TB) can take several minutes.
  • systemd by default allows ~90 seconds before it gives up.
  • Result: UmbrelOS drops into emergency mode on boot.

:fire: When Does This Happen?

  • UmbrelOS x86 on physical hardware
  • Using large HDDs (not sure if this also happens with large SSDs) for /data
  • On every system update (since it boots from the “other” root partition)

Nodes that store /data on very large ext4 drives (multiple TB) may enter emergency mode after an OTA update. The root cause is an insufficient systemd mount‑timeout for /data in the freshly flashed partition created by Umbrel’s A/B (Mender) upgrade process.


1 · What exactly breaks?

  • During the first boot of the new rootfs (e.g. /dev/sda3) systemd waits only ≈90 s for every fstab mount.
  • A multi‑terabyte ext4 partition can easily need several minutes (journal replay, fsck) → data.mount times out → systemd drops to emergency target.
  • Because UmbrelOS images ship with the default fstab, any earlier, manual mount‑timeout tweaks on the current partition (/dev/sda2) never reach the new one.

2 · Why does a custom data.mount unit not work?

  • Injecting a stand-alone data.mount (with an override) into /etc/systemd/system/ before the upgrade solves the boot timeout, but the additional unit appears to change the checksum / file list that Mender expects. As a result, mender commit fails with “nothing to commit” (or a rollback) and the web UI reports “update failed”.

Umbrel’s current update pipeline therefore tolerates fstab edits but not extra systemd units.

Technical explanation of the issue

Limited mount time in systemd

By default, systemd waits ~90 seconds to mount entries defined in /etc/fstab. My 8 TB data partition needs more than 90 s to mount fully (due to file‑system size, journal replay, etc.). Once the limit is exceeded, systemd marks the mount as failed and drops to emergency mode.
So we must extend or remove the mount‑timeout for /data to avoid boot failure.
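
To see whether a given fstab already carries a timeout override for /data, something like the following could be used. This is only a sketch: the helper name mount_timeout_for is made up for illustration, and it assumes the usual whitespace-separated fstab layout with the options in the fourth field.

```shell
# Print the x-systemd.mount-timeout configured for a mount point in an fstab
# file. Prints "default" when the entry has no override (systemd then falls
# back to its own default) and "missing" when there is no such entry.
# Usage: mount_timeout_for FSTAB [MOUNTPOINT]   (MOUNTPOINT defaults to /data)
mount_timeout_for() {
    fstab=$1
    mountpoint=${2:-/data}
    awk -v mp="$mountpoint" '
        # Skip comments; match on the mount point in field 2.
        $1 !~ /^#/ && $2 == mp {
            found = 1
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^x-systemd\.mount-timeout=/) {
                    sub(/^[^=]*=/, "", opts[i])   # keep only the value
                    print opts[i]
                    exit
                }
            print "default"
            exit
        }
        END { if (!found) print "missing" }
    ' "$fstab"
}
```

For example, mount_timeout_for /etc/fstab checks the running system, and the same call can later be pointed at the fstab of the other root partition once it is mounted.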

A/B boot scheme in UmbrelOS

UmbrelOS uses two system partitions (A and B) and the Mender updater for safe OTA upgrades. The active partition (say, “A”) runs UmbrelOS 1.3, while the inactive partition (“B”) either holds the previous version or is used to write the upgrade. During an upgrade, Mender installs the new UmbrelOS version onto the inactive partition and sets the bootloader so the next reboot starts from the updated partition. If the new system fails to boot, the A/B design automatically reverts to the previous system, keeping the device from bricking.

Custom changes lost after upgrade

Because the new system partition is written from a predefined image, it doesn't include manual customizations made on the old partition. The /etc/fstab parameters added in my current UmbrelOS version (to extend the mount-timeout or mark the mount as non-critical) are not carried over to the updated partition automatically. After updating and rebooting into the new partition, UmbrelOS 1.4 tries to mount the 8 TB partition with default settings (90 s timeout). The timeout is hit again:

 [FAILED] Failed to mount /data

and emergency mode is triggered.

Previous failed attempts

  • Editing only the active partition’s fstab before upgrading: ineffective, because the upgrade overwrites it on the new partition.
  • Creating a custom systemd unit (data.mount) on the inactive partition before upgrading: also failed; this can interfere with Mender (e.g., the inactive partition appears in use, or the custom unit is overwritten by the update image).

We therefore need a method to apply the timeout change inside the new UmbrelOS 1.4 partition before its first boot, so that it already has the extended timeout and mounts /data correctly.

-----------------------------------------------------------------------------------------


:white_check_mark: The Solution

Below is a safe method to upgrade UmbrelOS 1.3 → 1.4 (or likely any future version), inserting the necessary fstab change for /data before first boot. This prevents the mount failure and keeps the A/B update process intact.
Back up critical data before editing partitions or upgrading.


:hammer_and_wrench: Step-by-step Manual Fix (before updating)

1. Preparations

Terminal access (SSH or Umbrel web terminal) with sudo privileges.

2. Download the UmbrelOS 1.4 .update file

  • x86_64 (AMD64):
cd ~
wget https://download.umbrel.com/release/1.4.0/umbrelos-amd64.update -O umbrelos-1.4.update

3. Install the update (but do not reboot yet)

sudo mender install ~/umbrelos-1.4.update

Let Mender finish writing the image to the inactive partition; it will prompt that a reboot is required. Stop here, do not reboot. You are still running UmbrelOS 1.3 on the active partition.

4. Identify the inactive partition

sudo lsblk -o NAME,PARTUUID,FSTYPE,SIZE,MOUNTPOINT

Find which partition is mounted as /; the other root-size partition is the inactive one (e.g. /dev/sda3). Confirm carefully.

This assumes you’re running from /dev/sda2 and will update into /dev/sda3. Adjust if reversed.
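
If you'd rather not eyeball the lsblk output, the choice can be sketched as a small helper. The function name pick_inactive_root is hypothetical, and the two candidate devices are assumed to be /dev/sda2 and /dev/sda3 as in the example above; your layout may differ.

```shell
# Given the active root device and the two root candidates, print the other
# candidate. Fails if the active device is neither of them.
# Usage: pick_inactive_root ACTIVE CANDIDATE_A CANDIDATE_B
pick_inactive_root() {
    active=$1
    a=$2
    b=$3
    if [ "$active" = "$a" ]; then
        echo "$b"
    elif [ "$active" = "$b" ]; then
        echo "$a"
    else
        echo "active root $active is neither $a nor $b" >&2
        return 1
    fi
}

# Example (device names are assumptions, adjust to your system):
#   active=$(findmnt -n -o SOURCE /)
#   pick_inactive_root "$active" /dev/sda2 /dev/sda3
```

If findmnt reports something other than a plain partition for / (an overlay, for instance), the helper returns an error instead of guessing; in that case fall back to reading the lsblk output by hand.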

5. Mount the inactive partition

sudo mkdir -p /mnt/newroot
sudo mount /dev/sda3 /mnt/newroot   # replace with your inactive root

6. Edit the new partition’s fstab to extend /data timeout

sudo nano /mnt/newroot/etc/fstab

Add a /data line, or replace the existing one, with e.g.:

/dev/disk/by-partuuid/XXXXXXXX  /data  ext4  nofail,defaults,x-systemd.mount-timeout=300s  0  0

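(or use infinity instead of 300s). Replace XXXXXXXX with the PARTUUID of your data partition as shown by lsblk in step 4, or keep the device reference from the existing line. Save and exit.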
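
For those who prefer not to edit by hand, the same change could be scripted roughly like this. It is a sketch, not a tested part of the Umbrel tooling: patch_data_timeout is a made-up name, it assumes a /data entry already exists in the file (if there is none, nothing is changed), and it rewrites that line with single spaces between fields.

```shell
# Add nofail and x-systemd.mount-timeout to the /data entry of an fstab file.
# An existing timeout value is replaced; other lines are left untouched.
# Usage: patch_data_timeout FSTAB [TIMEOUT]   (TIMEOUT defaults to 300s)
patch_data_timeout() {
    fstab=$1
    timeout=${2:-300s}
    tmp=$(mktemp) || return 1
    awk -v t="$timeout" '
        # Only touch non-comment lines whose mount point (field 2) is /data.
        $1 !~ /^#/ && $2 == "/data" {
            if (("," $4 ",") !~ /,nofail,/)
                $4 = $4 ",nofail"
            if ($4 ~ /x-systemd\.mount-timeout=/)
                sub(/x-systemd\.mount-timeout=[^,]*/, "x-systemd.mount-timeout=" t, $4)
            else
                $4 = $4 ",x-systemd.mount-timeout=" t
        }
        { print }
    ' "$fstab" > "$tmp" && cat "$tmp" > "$fstab"   # cat keeps owner and mode
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

Run it from a root shell with the inactive partition mounted as in step 5, e.g. patch_data_timeout /mnt/newroot/etc/fstab 300s, and consider copying the original fstab to your home directory first so you can compare the result.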

7. Verify and unmount

grep "/data" /mnt/newroot/etc/fstab
sudo umount /mnt/newroot

8. Reboot into the updated partition

sudo reboot
# or: sudo rugpi-ctrl system reboot --spare   (if available)

The first boot may take several minutes; wait for /data to finish mounting.

Conclusions & recommendations

By editing the fstab of the new partition before its first boot, you preserve the extended mount‑timeout, prevent emergency mode, and let Umbrel’s A/B update process finish successfully. Remember: with dual‑root systems, any system‑file changes on the active partition are not copied automatically to the inactive one, so you must repeat this procedure for every future UmbrelOS upgrade until the project implements a built‑in fix.

If you ever forget and end up in emergency mode, you can still recover by editing fstab via the emergency shell or a live Linux USB, adding x-systemd.mount-timeout or nofail to the /data line, then rebooting.

:fire: Suggested long‑term improvements for Umbrel Team

  • Detect large /data devices and ship a generous default (TimeoutSec=300 or infinity).
  • Provide a GUI/CLI toggle for “slow data drive” that patches fstab automatically.
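
As a rough illustration of the first suggestion, detection could be as simple as comparing the device size against a cutoff. Everything here is an assumption for the sake of the sketch: the helper name suggest_mount_timeout, the 2 TB threshold, and the two timeout values are not taken from Umbrel.

```shell
# Suggest a mount timeout from the size of the data device.
# Usage: suggest_mount_timeout SIZE_BYTES [THRESHOLD_BYTES]
# SIZE_BYTES could come from e.g.: blockdev --getsize64 /dev/sdXN
suggest_mount_timeout() {
    size_bytes=$1
    threshold_bytes=${2:-2000000000000}   # assumed cutoff: 2 TB
    if [ "$size_bytes" -ge "$threshold_bytes" ]; then
        echo 300s    # large disk: allow several minutes
    else
        echo 90s     # small disk: keep the usual ~90 s
    fi
}
```

An 8 TB drive (8000000000000 bytes) would get 300s under these assumptions, while a 500 GB drive would stay at 90s.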

Until such changes land, the pre‑update fstab edit on the inactive rootfs is the safest way to avoid emergency‑mode boot loops on large‑disk Umbrel nodes.
