[BUG] BTRFS /var/cache subvolume busy on shutdown — system hangs, requires hard reset (7.1.1-2 regression from 7.0.12) #895

Description

@ur3ley

BUG: /var/cache BTRFS subvolume busy on shutdown — 7.1.1-2-cachyos

Summary

After upgrading from linux-cachyos 7.0.12-1 to 7.1.1-2 (2026-06-20), every single shutdown shows umount: /var/cache: target is busy. In 2 of 7 consecutive boots, this causes a complete shutdown hang — the display goes dark, but the PC stays powered on (fans spinning, LEDs on), requiring a hard reset (hold power button).

System info

Component            Detail
Distro               CachyOS (rolling)
Kernel               7.1.1-2-cachyos (znver4)
Last working kernel  7.0.12-1-cachyos
systemd              261-1
DE                   KDE Plasma 6.7.1 / Wayland
CPU                  AMD Strix Halo (Radeon 8050S/8060S)
NVMe                 2× Samsung 990 PRO 4TB Heatsink (FW: 4B2QJXD7)
Filesystem           BTRFS RAID0 (data) + RAID1 (metadata) on /dev/nvme0n1p2 + /dev/nvme1n1p1

Kernel command line

BOOT_IMAGE=/@/boot/vmlinuz-linux-cachyos root=UUID=c98b6640-53b8-44d3-b579-abb17b1c564d rw
rootflags=subvol=@ nowatchdog nvme_load=YES zswap.enabled=0 splash loglevel=3
amdgpu.exp_hw_support=1 iommu=pt amd_iommu=on amdgpu.noretry=0 pcie_aspm=off
ttm.pages_limit=29884416 amdgpu.max_allocation_size=114688
nvme_core.default_ps_max_latency_us=0

BTRFS layout

All subvolumes are on the same BTRFS RAID0 volume (UUID=c98b6640...):

Mount point          Subvolume
/                    @
/home                @home
/root                @root
/srv                 @srv
/data                @data
/media               @media
/var/cache           @cache
/var/tmp             @tmp
/var/log             @log
/var/lib/ollama      bind from /data/ollama
/var/lib/fastflowlm  bind from /data/fastflowlm

BTRFS usage:

Data,RAID0:   3.79TiB allocated, 3.67TiB used
Metadata,RAID1: 15.00GiB allocated, 10.49GiB used

Evidence

1. Every shutdown since kernel 7.1.1-2 shows the error

7 of 7 boots (Jun 24 — Jun 26, all on kernel 7.1.1-2) show umount: /var/cache: target is busy:

Boot -7  Jun 25 00:29:47  umount[145648]: umount: /var/cache: target is busy.  → HUNG
Boot -6  Jun 25 07:53:41  umount[37818]:  umount: /var/cache: target is busy.  → OK
Boot -5  Jun 25 18:49:09  umount[84641]:  umount: /var/cache: target is busy.  → OK
Boot -4  Jun 26 00:30:01  umount[94665]:  umount: /var/cache: target is busy.  → OK
Boot -3  Jun 26 07:48:05  umount[38485]:  umount: /var/cache: target is busy.  → OK
Boot -2  Jun 26 17:02:08  umount[27443]:  umount: /var/cache: target is busy.  → OK
Boot -1  Jun 26 17:08:46  umount[70567]:  umount: /var/cache: target is busy.  → FAIL + reboot
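
The per-boot tally above can be reproduced from the journal. This is a minimal sketch, assuming the messages are logged under the syslog identifier umount (as in the excerpt) and that the journal is persistent across boots; count_busy and scan_boots are hypothetical helper names, not part of any tool.

```shell
#!/bin/sh
# Count "target is busy" umount failures in journal text read from stdin.
# grep -c exits non-zero when the count is 0, so the status is ignored.
count_busy() {
    grep -c 'umount: .*: target is busy' || true
}

# Print one line per previous boot: its offset and the number of failures.
# Usage: scan_boots 7   (scans boots -1 through -7)
scan_boots() {
    n=${1:-7}
    i=1
    while [ "$i" -le "$n" ]; do
        c=$(journalctl -b "-$i" -t umount --no-pager -q 2>/dev/null | count_busy)
        printf 'boot -%s: %s busy failure(s)\n' "$i" "$c"
        i=$((i + 1))
    done
}
```

Running scan_boots 7 on the affected machine should list each of the seven boots with its count. A boot whose journal cannot be read is reported with a count of 0, so a zero is not by itself proof of a clean shutdown.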

2. Hung shutdown: systemd never reaches "Reached target Shutdown"

In boot -7 (the first affected boot after the kernel update), the journal ends abruptly:

Jun 25 00:29:47 GMKtec-CachyOS systemd[1]: Unmounting /var/cache...
Jun 25 00:29:47 GMKtec-CachyOS systemd[1]: Unmounting /var/lib/fastflowlm...
Jun 25 00:29:47 GMKtec-CachyOS systemd[1]: Unmounting /var/lib/ollama...
Jun 25 00:29:47 GMKtec-CachyOS systemd[1]: Unmounting /var/tmp...
Jun 25 00:29:47 GMKtec-CachyOS umount[145648]: umount: /var/cache: target is busy.
    [boot/efi, home containers, media, root, srv, tmp: all unmounted successfully]
Jun 25 00:29:47 GMKtec-CachyOS systemd[1]: Failed unmounting /var/cache.
    [var-lib-fastflowlm.mount: Deactivated successfully]
    [JOURNAL ENDS: NO "Reached target Shutdown", NO "System is powering off"]

In contrast, boots that recovered show:

Boot -6: Jun 25 06:57:40  systemd[1028]: Reached target Shutdown.  ← present
Boot -7:                                              ← MISSING (hung)
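
The same comparison can be scripted for any boot offset. This is a rough sketch, assuming the message text quoted in this report ("Reached target Shutdown") is the marker of interest; reached_shutdown and classify_boot are hypothetical helper names.

```shell
#!/bin/sh
# Return 0 if journal text on stdin contains the shutdown-target message
# quoted in this report, 1 otherwise.
reached_shutdown() {
    grep -q 'Reached target Shutdown'
}

# Classify one boot by offset (for example -7) based on that message.
# An empty or unreadable journal is also reported as "not reached".
classify_boot() {
    if journalctl -b "$1" --no-pager -q 2>/dev/null | reached_shutdown; then
        printf 'boot %s: shutdown target reached\n' "$1"
    else
        printf 'boot %s: shutdown target NOT reached (possible hang)\n' "$1"
    fi
}
```

For example, classify_boot -7 and classify_boot -6 would be expected to give different answers on the affected machine, matching the two lines above.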

3. Timing: all four mounts are unmounted in the same second

systemd starts unmounting /var/cache, /var/lib/fastflowlm, /var/lib/ollama, and /var/tmp in the same second (00:29:47). /var/lib/fastflowlm, a bind mount on the same filesystem, is deactivated successfully right after /var/cache fails, which suggests a VFS refcounting race on the shared BTRFS superblock.

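4. Workaround is consistent with the suspected cause

Adding LazyUnmount=yes in a systemd drop-in for var-cache.mount:

# /etc/systemd/system/var-cache.mount.d/lazy-unmount.conf
[Mount]
LazyUnmount=yes

This makes systemd perform a lazy unmount (the equivalent of umount -l), which detaches the mount point from the filesystem hierarchy immediately and cleans up the remaining references once they are no longer busy, so the shutdown sequence can complete. With the drop-in in place, the hang no longer occurs. A lazy unmount sidesteps any outstanding reference, so on its own it cannot distinguish a kernel-side refcount problem from a process holding files open; taken together with the fuser result in point 5, a VFS refcount problem is the more likely explanation.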

5. No process is actually holding the mount

fuser -vm /var/cache during normal operation shows only kernel mount — no user-space process has files open. The busy condition originates from within the kernel VFS/BTRFS layer.
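
Because fuser only lists processes, and the check above was made during normal operation rather than at shutdown, it may be worth ruling out two non-process causes of EBUSY as well: a mount nested under /var/cache and an active swap file on it. The following is a minimal sketch; submounts_of and check_busy are hypothetical helper names, and the parsing assumes the standard /proc/<pid>/mountinfo layout, where field 5 is the mount point.

```shell
#!/bin/sh
# List mount points strictly below a directory, reading text in
# /proc/<pid>/mountinfo format from stdin (field 5 is the mount point).
# Not meant to be called with "/" as the directory.
submounts_of() {
    awk -v p="${1%/}" '$5 != p && index($5, p "/") == 1 { print $5 }'
}

# Print possible reasons for EBUSY on a mount point.
check_busy() {
    mp=$1
    echo "== nested mounts under $mp =="
    submounts_of "$mp" < /proc/self/mountinfo
    echo "== active swap areas =="
    cat /proc/swaps
    echo "== processes using $mp (may be empty) =="
    fuser -vm "$mp" 2>&1 || true
}
```

Running check_busy /var/cache as root shortly before shutdown gives a snapshot closer to the failing state than a check during normal operation. If all three sections come back empty, that supports the conclusion that the reference is held inside the kernel.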

Reproducibility

  • Happens on every shutdown (7/7 boots, 100%)
  • System completely hangs on ~30% of shutdowns (2/7)
  • On the remaining ~70%, systemd eventually recovers but the error is still logged
  • Boot -7 was a 7-hour session with qbittorrent, emby-server, jellyseerr, ollama, fastflowlm running
  • Boot -1 was a 2-minute session after a kernel update — still showed the error

Regression

  • Last known good: linux-cachyos 7.0.12-1 (and all prior 7.0.x)
  • First known bad: linux-cachyos 7.1.1-2 (installed 2026-06-20)
  • Package updated: 2026-06-20T17:58:09+0300
  • First affected boot: ~2026-06-24 (system not rebooted for 4 days after update)
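
The update timestamp can be cross-checked against the pacman log. This is a small sketch, assuming the usual pacman.log line shape "[timestamp] [ALPM] upgraded pkg (old -> new)"; pkg_upgrades is a hypothetical helper name.

```shell
#!/bin/sh
# Print pacman.log upgrade entries for exactly one package name,
# reading the log from stdin. Expected line shape:
#   [<timestamp>] [ALPM] upgraded <pkg> (<old> -> <new>)
pkg_upgrades() {
    awk -v pkg="$1" '$2 == "[ALPM]" && $3 == "upgraded" && $4 == pkg'
}
```

Usage: pkg_upgrades linux-cachyos < /var/log/pacman.log lists every kernel upgrade with its old and new version, and uname -r shows which kernel is actually running, which matters here because the system was not rebooted for several days after the update.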

Between 7.0.12 and 7.1.1, the CachyOS kernel adds 15 topic branches, including vmalloc-free (bulk page freeing in vfree, vrealloc improvements), mglru enhancements, and cachy MM tuning. These memory-management (MM) changes are the likely candidates.

Workaround

# /etc/systemd/system/var-cache.mount.d/lazy-unmount.conf
[Mount]
LazyUnmount=yes

Then systemctl daemon-reload.
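
For convenience, the drop-in can be created from a script. This is a minimal sketch of the steps above; write_dropin is a hypothetical helper name, and its optional root-prefix argument is an assumption added only so the function can be tried in a scratch directory first.

```shell
#!/bin/sh
# Write the LazyUnmount drop-in for var-cache.mount.
# $1 is an optional root prefix for dry runs; empty means the real /etc.
write_dropin() {
    dir="${1:-}/etc/systemd/system/var-cache.mount.d"
    mkdir -p "$dir" || return 1
    printf '[Mount]\nLazyUnmount=yes\n' > "$dir/lazy-unmount.conf"
}
```

As root, write_dropin && systemctl daemon-reload applies it, and systemctl cat var-cache.mount should then show the drop-in below the unit itself.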
