Trying to "poweroff" results in the system hanging with a failure to unmount

Creating a bug report/issue

I have searched the existing open and closed issues

Required Information

  • DietPi version | v9.5.1
  • Distro version | bookworm
  • Kernel version | Linux DietPiVM 6.1.0-21-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03) x86_64 GNU/Linux
  • Architecture | amd64
  • SBC model | Native PC (x86_64) VM running on Proxmox
  • Power supply used | N/A
  • SD card used | N/A

Additional Information (if applicable)

  • Software title | SAMBA
  • Was the software title installed freshly or updated/migrated? Originally fresh, since updated to Bookworm.
  • Can this issue be replicated on a fresh installation of DietPi? Yes, fresh setup has same issue.
    ā† If you sent a ā€œdietpi-bugreportā€, please paste the ID here ā†’
  • Bug report ID | 2f450358-b5ed-4bf6-95eb-c874f94faea6

Steps to reproduce

  1. Issue the “poweroff” command within the VM, or issue the Shutdown command to the VM from Proxmox.
  2. Witness the error: [FAILED] Failed unmounting mnt-mysambashare.mount - /mnt/mysambashare.
    (actual name of share changed for this post).

Expected behaviour

  • VM shuts down, or reboots, without issue.

Actual behaviour

  • The error [FAILED] Failed unmounting mnt-mysambashare.mount - /mnt/mysambashare. appears in the console and the VM hangs until forcefully reset.

Extra details

  • If I run the below command the VM will shutdown fine. But this forces the unmount and has to be carried out via the console - I cannot issue the command as part of the shutdown process from Proxmox.
  • umount -f -l /mnt/mysambashare && poweroff

Probably an application is still trying to access the network share, which prevents regular unmounting.

How does the following behave?

umount /mnt/mysambashare

Another possibility would be that the network interfaces are down earlier, before the network share has been disconnected correctly. :thinking:

Good shout, trying the umount command I get this…
umount: /mnt/mysambashare: target is busy

I stopped qbittorrent (systemctl stop qbittorrent) which then allowed me to unmount the share.
I had to umount it a couple of times before it actually unmounted (it would still show files when ls-ing the directory).

So, it appears the poweroff or reboot commands are trying to unmount the busy mount before the services/application using it are stopped.
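
To see which processes keep a mount busy before unmounting, fuser -vm /mnt/mysambashare (from the psmisc package, if installed) is the usual tool. As a minimal sketch that needs no extra packages, the following scans /proc for processes whose working directory or open files lie below a directory. The function name busy_pids is hypothetical, and it needs root to inspect other users' processes.

```shell
#!/bin/sh
# busy_pids DIR: print the PID of every process whose working directory
# or an open file descriptor points at DIR or something below it.
# Hypothetical helper; run as root to see other users' processes.
busy_pids() {
    target=$(readlink -f "$1") || return 1
    for proc in /proc/[0-9]*; do
        for link in "$proc"/cwd "$proc"/fd/*; do
            dest=$(readlink "$link" 2>/dev/null) || continue
            case "$dest" in
                "$target" | "$target"/*)
                    echo "${proc#/proc/}"
                    break   # one hit per process is enough
                    ;;
            esac
        done
    done
}

# Example: busy_pids /mnt/mysambashare
```

Each printed PID can then be looked up with ps -p PID to find the service that should be stopped first.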

I wouldn’t know where to even start to remedy that.

Theoretically, we could set up persistent logging to check the order of events during shutdown.

persistent system logs:

dietpi-software uninstall 103 # uninstalls DietPi-RAMlog
mkdir /var/log/journal # triggers systemd-journald logs to disk
reboot # required to finalise the RAMlog uninstall

Then you can check system logs via:

journalctl

which will then also show logs from previous boot sessions. To limit the size, you can additionally apply e.g. the following:

mkdir -p /etc/systemd/journald.conf.d
cat << '_EOF_' > /etc/systemd/journald.conf.d/99-custom.conf
[Journal]
SystemMaxFiles=2
MaxFileSec=7day
_EOF_

This will limit logs to 14 days split across two journal files, so that with rotation you will always have between 7 and 14 days of logs available.
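
With persistent logging in place, a shutdown that hung can be read back after the reset, as far as journald managed to write it to disk. A minimal sketch, assuming the hung shutdown belongs to the previous boot: journalctl's -b -1 selects that boot and -u filters by unit. The helper name shutdown_log is hypothetical.

```shell
#!/bin/sh
# shutdown_log UNIT...: show journal entries of the previous boot (-b -1)
# for the given units, with microsecond timestamps (-o short-precise)
# so the stop/unmount order is easier to read. Hypothetical helper name.
shutdown_log() {
    units=""
    for unit in "$@"; do
        units="$units -u $unit"
    done
    # $units is intentionally unquoted: it expands to "-u NAME" pairs.
    journalctl -b -1 -o short-precise $units
}

# Example: shutdown_log qbittorrent.service mnt-mysambashare.mount remote-fs.target
```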

This is great advice. Thank you very much for helping me here.
OK. I’ve done the above and can see the logs using journalctl just fine.
I made sure to reboot using the command systemctl stop qbittorrent && reboot to ensure a clean reboot and no hangs with the mount error.

With the above setup I’ve tried the poweroff command again to reproduce the issue and had to reset the VM to get it going again.
Will the reset still allow enough logs to capture what’s going wrong?
Where do I poke about now to try and figure this out?

I’ve pulled out the relevant part of the logs (well, according to the timestamps) from when I entered the poweroff command, through eventually resetting the VM, up to the time of boot.

It looks as though the unmounting is attempted before the services are stopped. But that’s where my insight stops. I have no idea what to do with this or how to resolve it. Maybe there’s a way of having DietPi unmount all mount points after applications/services have stopped? I feel that would work, as the VM cycles fine if I stop qbittorrent first.

I’ve not bothered changing the mount name in the log - I don’t think there’s anything identifiable in there.

log-extract.txt (11.9 KB)

@MichaIng
Any idea why system is trying to unmount shares before app services are stopped?

If I am not mistaken, the process is shut down in the reverse order to the start. I’m probably not right.

I have no idea myself. I’ve assumed the same - I type “poweroff” and the system manages the clean shutdown itself.
Hopefully I can figure this out as it’s the only thing preventing me automatically rebooting my Proxmox server during maintenance - I have to manually power off this one VM first.

Yes, systemd stops all units (services, mounts, targets, …) in reverse order on shutdown. However, not everything is strictly ordered. If qbittorrent.service and mnt-mysambashare.mount have no direct or indirect Before=/After= ordering between each other, it is perfectly possible that systemd tries to unmount the share before qBittorrent is stopped.

However, they have an indirect ordering:

  • (Our) qbittorrent.service has After=remote-fs.target
  • CIFS and NFS shares (should) have Before=remote-fs.target

@polite-garlic
Can you verify this:

systemctl show -p Before,After qbittorrent.service mnt-dagobah_downloads.mount

Indeed, systemd somehow tries to unmount the share much earlier, not sure why. Let’s see what the above output shows. And qbittorrent.service seems to hang while shutting down, finally finishing after remote-fs.target:

Jun 23 12:21:28 DietPiVM systemd[1]: Stopping qbittorrent.service - qBittorrent (DietPi)...
Jun 23 12:21:28 DietPiVM qbittorrent-nox[609]: QObject::killTimer: Timers cannot be stopped from another thread
Jun 23 12:21:28 DietPiVM qbittorrent-nox[609]: QObject::~QObject: Timers cannot be stopped from another thread
Jun 23 12:21:28 DietPiVM systemd[1]: Stopped target remote-fs.target - Remote File Systems.
Jun 23 12:21:28 DietPiVM systemd[1]: Stopped target remote-fs-pre.target - Preparation for Remote File Systems.
Jun 23 12:21:43 DietPiVM qbittorrent-nox[609]: WebUI will be started shortly after internal preparations. Please wait...
Jun 23 12:21:43 DietPiVM qbittorrent-nox[609]: ******** Information ********
Jun 23 12:21:43 DietPiVM qbittorrent-nox[609]: To control qBittorrent, access the WebUI at: http://localhost:1340
Jun 23 12:21:43 DietPiVM systemd[1]: qbittorrent.service: Deactivated successfully.

This “WebUI will be started shortly” message is particularly weird. Maybe it is related to the attempted unmount.

Please also show the content of both units:

systemctl cat qbittorrent.service mnt-dagobah_downloads.mount

No worries, thanks for your time helping with this.

Output of systemctl show -p Before,After qbittorrent.service mnt-dagobah_downloads.mount …

Before=multi-user.target shutdown.target
After=-.mount network-online.target systemd-journald.socket systemd-remount-fs.service system.slice sysinit.target basic.target

Before=umount.target
After=remote-fs-pre.target network-online.target network.target -.mount system.slice systemd-journald.socket mnt-dagobah_downloads.automount
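
The ordering question can also be checked mechanically. A small sketch (the function name has_ordering is hypothetical) that reads systemctl show -p Before,After UNIT output on stdin and reports whether a given unit is listed under a given property:

```shell
#!/bin/sh
# has_ordering PROPERTY UNIT: read "systemctl show -p Before,After ..."
# output on stdin; succeed if UNIT is listed in the PROPERTY= line.
# Hypothetical helper name.
has_ordering() {
    prop=$1
    unit=$2
    while IFS= read -r line; do
        case "$line" in
            "$prop="*)
                # The value is a space-separated list of unit names.
                for dep in ${line#*=}; do
                    [ "$dep" = "$unit" ] && return 0
                done
                ;;
        esac
    done
    return 1
}

# Example:
# systemctl show -p Before,After qbittorrent.service | has_ordering After remote-fs.target \
#     && echo "ordered after remote-fs.target" || echo "no such ordering"
```

Fed with the qbittorrent.service output above, the check for After remote-fs.target fails, which matches the missing ordering.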

Output of systemctl cat qbittorrent.service mnt-dagobah_downloads.mount …

# /etc/systemd/system/qbittorrent.service
[Unit]
Description=qBittorrent (DietPi)
Documentation=man:qbittorrent-nox(1)
Wants=network-online.target
After=network-online.target

[Service]
User=qbittorrent
UMask=002
LogsDirectory=qbittorrent
ExecStart=/usr/bin/qbittorrent-nox

[Install]
WantedBy=multi-user.target

# /run/systemd/generator/mnt-dagobah_downloads.mount
# Automatically generated by systemd-fstab-generator

[Unit]
Documentation=man:fstab(5) man:systemd-fstab-generator(8)
SourcePath=/etc/fstab

[Mount]
What=//dagobah/Downloads
Where=/mnt/dagobah_downloads
Type=cifs
Options=cred=/var/lib/dietpi/dietpi-drive_manager/mnt-dagobah_downloads.cred,iocharset=utf8,uid=dietpi,gid=dietpi,file_mode=0770,dir_mode=0770,vers=3.1.1,nofail,noauto,x-systemd.automount

Ah I see, you probably installed qBittorrent before I consistently added After=remote-fs.target to all of them. Please try this:

cat << '_EOF_' > /etc/systemd/system/qbittorrent.service
[Unit]
Description=qBittorrent (DietPi)
Documentation=man:qbittorrent-nox(1)
Wants=network-online.target
After=network-online.target remote-fs.target

[Service]
User=qbittorrent
UMask=002
LogsDirectory=qbittorrent
ExecStart=/usr/bin/qbittorrent-nox

[Install]
WantedBy=multi-user.target
_EOF_
systemctl daemon-reload
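
As an alternative to overwriting the whole unit file, the extra ordering could be added as a systemd drop-in, which leaves the original unit untouched. A minimal sketch, assuming the drop-in file name remote-fs.conf (an arbitrary choice) and an optional root-directory argument that exists only so the function can be tried against a scratch directory:

```shell
#!/bin/sh
# add_remote_fs_ordering [ROOT]: write a drop-in that adds
# After=remote-fs.target to qbittorrent.service. ROOT defaults to "",
# i.e. the real /etc. Function and file names are hypothetical.
add_remote_fs_ordering() {
    dir="${1:-}/etc/systemd/system/qbittorrent.service.d"
    mkdir -p "$dir" || return 1
    cat << '_EOF_' > "$dir/remote-fs.conf"
[Unit]
After=remote-fs.target
_EOF_
}

# Example (as root): add_remote_fs_ordering && systemctl daemon-reload
```

After= is a list setting, so the drop-in extends the After=network-online.target already present in the unit rather than replacing it.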

Thanks for the help. I’ve updated the file as per your suggestion and really got my hopes up there!
But, alas… I still get the dreaded “FAILED” error with a stuck mount on poweroff or reboot.

However… I’ve just set up another DietPi VM (I swear, I tried this already!) from scratch. Created the network mount via Drive Manager, then installed qBittorrent, and tested with reboots and poweroffs again.

The fresh VM reboots without any issue at all! Go figure.

I don’t see an easy way of exporting/importing the qBittorrent settings so I’ve decided to set up a new VM from scratch over the weekend.

It’ll be better than asking you fine people to keep suggesting troubleshooting tasks for me.

Thanks again, all!

Did you try another reboot? I am not entirely sure whether a changed service file already takes effect on the very next reboot (despite systemctl daemon-reload).

One thing that did make me wonder is that the CIFS mount has After=remote-fs-pre.target, but not Before=remote-fs.target. I would have expected both. Can you check this again on the VM?

I did indeed try another reboot. In fact I performed systemctl stop qbittorrent && reboot to make sure it rebooted cleanly before I attempted a poweroff or reboot after.

Needless to say the clean reboot was fine, but the subsequent plain commands resulted in the same failure.

I would have expected both. Can you check this again on the VM?

I’m not sure what you mean here.

On a side note, my impatience got the better of me…
After stopping qBittorrent on the original VM, I ran scp -r /home/qbittorrent root@NEWVM:/home/ to copy over the qBittorrent config then installed qBittorrent on the new VM. It came up fine with all my settings and tested OK. The reboots and poweroffs behave fine too.

I’ve binned it now I’ve confirmed the test, but I’ll set it up again over the weekend with the VPN and everything else, and reconfigure the IP and hostname to match the original, problem, VM.


I tested it as well on some VMs here, and neither CIFS/SMB nor NFS mounts have Before=remote-fs.target. Indeed, this means that a service which orders itself After=remote-fs.target is not guaranteed to stop before systemd tries to unmount the network mounts. Here are logs from a Bullseye and a Bookworm VM, which both show this pretty clearly:

root@VM-Bullseye:~# journalctl -u qbittorrent -u mnt-samba.mount -u mnt-nfs_client.mount -u remote-fs.target -u remote-fs-pre.target
Jun 29 16:22:48 VM-Bullseye systemd[1]: Unmounting /mnt/nfs_client...
Jun 29 16:22:48 VM-Bullseye systemd[1]: Unmounting /mnt/samba...
Jun 29 16:22:48 VM-Bullseye qbittorrent-nox[2785]: Catching signal: SIGTERM
Jun 29 16:22:48 VM-Bullseye qbittorrent-nox[2785]: Exiting cleanly
Jun 29 16:22:48 VM-Bullseye systemd[1]: Stopping qBittorrent (DietPi)...
Jun 29 16:22:48 VM-Bullseye systemd[1]: mnt-samba.mount: Succeeded.
Jun 29 16:22:48 VM-Bullseye systemd[1]: Unmounted /mnt/samba.
Jun 29 16:22:48 VM-Bullseye qbittorrent-nox[2785]: ******** Information ********
Jun 29 16:22:48 VM-Bullseye qbittorrent-nox[2785]: To control qBittorrent, access the Web UI at http://localhost:1340
Jun 29 16:22:48 VM-Bullseye systemd[1]: qbittorrent.service: Succeeded.
Jun 29 16:22:48 VM-Bullseye systemd[1]: Stopped qBittorrent (DietPi).
Jun 29 16:22:48 VM-Bullseye systemd[1]: Stopped target Remote File Systems.
Jun 29 16:22:54 VM-Bullseye systemd[1]: mnt-nfs_client.mount: Succeeded.
Jun 29 16:22:54 VM-Bullseye systemd[1]: Unmounted /mnt/nfs_client.
Jun 29 16:22:54 VM-Bullseye systemd[1]: Stopped target Remote File Systems (Pre).
root@VM-Bookworm:~# journalctl -u qbittorrent -u mnt-samba.mount -u mnt-nfs_client.mount -u remote-fs.target -u remote-fs-pre.target
Jun 29 16:22:46 VM-Bookworm systemd[1]: Unmounting mnt-nfs_client.mount - /mnt/nfs_client...
Jun 29 16:22:46 VM-Bookworm systemd[1]: Unmounting mnt-samba.mount - /mnt/samba...
Jun 29 16:22:46 VM-Bookworm systemd[1]: Stopping qbittorrent.service - qBittorrent (DietPi)...
Jun 29 16:22:46 VM-Bookworm qbittorrent-nox[2272]: Catching signal: SIGTERM
Jun 29 16:22:46 VM-Bookworm qbittorrent-nox[2272]: Exiting cleanly
Jun 29 16:22:46 VM-Bookworm qbittorrent-nox[2272]: WebUI will be started shortly after internal preparations. Please wait...
Jun 29 16:22:46 VM-Bookworm qbittorrent-nox[2272]: ******** Information ********
Jun 29 16:22:46 VM-Bookworm qbittorrent-nox[2272]: To control qBittorrent, access the WebUI at: http://localhost:1340
Jun 29 16:22:46 VM-Bookworm systemd[1]: mnt-nfs_client.mount: Deactivated successfully.
Jun 29 16:22:46 VM-Bookworm systemd[1]: Unmounted mnt-nfs_client.mount - /mnt/nfs_client.
Jun 29 16:22:46 VM-Bookworm systemd[1]: qbittorrent.service: Deactivated successfully.
Jun 29 16:22:46 VM-Bookworm systemd[1]: Stopped qbittorrent.service - qBittorrent (DietPi).
Jun 29 16:22:46 VM-Bookworm systemd[1]: mnt-samba.mount: Deactivated successfully.
Jun 29 16:22:46 VM-Bookworm systemd[1]: Unmounted mnt-samba.mount - /mnt/samba.
Jun 29 16:22:46 VM-Bookworm systemd[1]: Stopped target remote-fs.target - Remote File Systems.
Jun 29 16:22:46 VM-Bookworm systemd[1]: Stopped target remote-fs-pre.target - Preparation for Remote File Systems.

Both show the same pattern: nothing prevents the network shares from being unmounted first, notably before Stopped target Remote File Systems/remote-fs.target. I’d call this a bug in systemd or probably Debian.

… checking the man page: systemd.mount

  • Network mount units automatically acquire After= dependencies on remote-fs-pre.target, network.target, plus After= and Wants= dependencies on network-online.target, and a Before= dependency on remote-fs.target, unless one or more mount options among nofail, x-systemd.wanted-by=, and x-systemd.required-by= is set.
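
The rule quoted above can be turned into a quick check of an fstab option string. A minimal sketch, with the hypothetical function name loses_default_ordering, that only tests for the three options named in the man page excerpt:

```shell
#!/bin/sh
# loses_default_ordering OPTIONS: succeed if a comma-separated fstab option
# string contains one of the options that, per the systemd.mount excerpt
# above, suppress the automatic dependencies of network mounts
# (including Before=remote-fs.target). Hypothetical helper name.
loses_default_ordering() {
    old_ifs=$IFS
    IFS=,
    set -- $1          # split the option string on commas
    IFS=$old_ifs
    for opt in "$@"; do
        case "$opt" in
            nofail | x-systemd.wanted-by=* | x-systemd.required-by=*)
                return 0
                ;;
        esac
    done
    return 1
}

# Example:
# loses_default_ordering "vers=3.1.1,nofail,noauto,x-systemd.automount" \
#     && echo "no automatic Before=remote-fs.target"
```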

Indeed, we set the nofail mount option for everything but root and boot mounts, because otherwise the boot sequence would stop if a drive is not attached or a network share is not available. However, we also use noauto,x-systemd.automount, so the mounts are not part of the initial boot sequence; instead, they are mounted on the first attempt to access the mount point. Probably because of this, even a failing mount has no effect on the boot sequence.

systemd automount, however, has another nasty effect: it tries for 90 seconds to mount the drive and holds everything else in the meantime. For this reason, I was thinking of removing it for network shares. They have After=network-online.target, and on DietPi (compared to plain Debian) this does have some more meaning. It still does not guarantee that Internet access is really fully there, though, e.g. if DHCP times out and is repeated later.

I need some testing.

Blimey, that’s some intensive testing!
This came through at a weirdly perfect time - I was just this second setting out my steps for creating a fresh DietPi VM for my qBittorrent and SMB share setup!

From reading your summary, does that mean that - even though my tests show the unmount behaving fine on my test VM - my fresh setup may still eventually exhibit the same problem… maybe later?

Yes, nothing prevents the issue from happening. It is somewhat random then whether qBittorrent stops first, or whether the network shares are attempted to be unmounted first. And of course it depends on whether qBittorrent is actually accessing the mount (downloading something) while you are shutting down.

Did further tests:

  • I removed nofail, which added Before=remote-fs.target to the mount units and solved the potential shutdown issue.
  • I added an ls /mnt/samba to our dietpi-preboot.service, and indeed it hangs the boot process for 90 seconds … ehm, somehow much longer, both VMs still hang … this is weird: when running the same command on console, it hangs for exactly 90 seconds, but where this hurts much more, during boot, it hangs much longer … sending the post, I’ll let them run now and see whether they ever finish boot :smile:. However, this problem is unrelated to nofail.
  • Next test is to replace x-systemd.automount with auto (removing noauto), and see which other services and targets are actually failing because of this, and whether they are relevant for bootup.

The missing Before=*-fs.target entry is btw the same for local drives and local-fs.target: systemd.mount
For boot, this somehow makes sense, since they are not mounted on first mountpoint access, regardless of how they are ordered. But it is relevant on shutdown. So IMO this behaviour is faulty, and the Before=*-fs.target entry should be added in any case.

EDIT: Okay, I am not sure why, but both VMs never booted, i.e. they seem to try indefinitely to mount the non-existing remote shares. However, it is probably a rare occurrence that a blocking unit of the boot sequence tries to access a network mountpoint. Funny to see active systemd timers triggering their services while the main boot sequence is just on hold forever.

EDIT2: Another test, removing nofail,noauto,x-systemd.automount entirely, i.e. having network shares mounted via boot sequence.

  • This solved the shutdown issue, i.e. assured that qBittorrent is stopped first.
  • When a mount fails, e.g. because the server is down, or DHCP did not bring up full network access sufficiently fast, the remote-fs.target however fails. And since the Samba mount takes 6 seconds to fail (not sure whether this is processing or some timeout), it delays all units which order themselves after this target by 6 seconds, which includes multi-user.target and the local login prompt. The beauty of x-systemd.automount here is that remote-fs.target is not delayed at all, but those services/processes which really access a particular mountpoint, trigger it to be mounted, the exact time they attempt to access. A much more specific and clever method which allows best parallelisation. If only this nasty boot hang would not be possible.
  • I re-added noauto,x-systemd.automount, which solved the delay and indeed remote-fs.target was reached immediately.
  • Again unrelated to the issue: I added an entirely new service which tries to access the failing mount. And boot still fails entirely for both systems, even though nothing requires or orders itself after this new service. It really is systemd automount somehow stopping any systemd action while trying to mount this non-existing/failing share. And again, there is no timeout, all I can do is reset the VMs.

Gaaah, and I’ve just hit the issue again too. Haven’t even installed qBittorrent on my new VM yet. Changed the IP and hostname, issued the reboot command, then…

:frowning_face:

Oh well. Good luck with your investigation - I wasn’t expecting anyone to be quite so determined!

Wow, good work.

So you fixed the failing shutdown, but at the expense of a borked boot sequence? Am I reading that right?

I was mixing two topics, which are not that much related. Actually, the fact that we use x-systemd.automount means that nofail has no effect and can just be removed, solving your issue. I changed the drive manager to not add the nofail option in this case: v9.6 · MichaIng/DietPi@562630a · GitHub
You can just edit your /etc/fstab and remove the nofail from the CIFS mount entry, followed by running systemctl daemon-reload.
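
For those who prefer not to edit the file by hand, here is a minimal sketch of the same change. It assumes that only cifs and nfs/nfs4 entries should be touched, and the function name strip_nofail is hypothetical. It prints the result instead of modifying the file, so the change can be reviewed first.

```shell
#!/bin/sh
# strip_nofail FSTAB: print FSTAB with "nofail" removed from the options
# (4th field) of cifs and nfs/nfs4 entries only. Nothing is modified in
# place. Changed lines are re-joined by awk with single spaces.
# Hypothetical helper name.
strip_nofail() {
    awk '
        /^[ \t]*#/ { print; next }            # keep comments untouched
        $3 == "cifs" || $3 ~ /^nfs4?$/ {
            n = split($4, opts, ",")
            out = ""
            for (i = 1; i <= n; i++)
                if (opts[i] != "nofail")
                    out = out (out == "" ? "" : ",") opts[i]
            new = (out == "" ? "defaults" : out)
            if (new != $4) $4 = new
        }
        { print }
    ' "$1"
}

# Example (as root):
# strip_nofail /etc/fstab > /tmp/fstab.new
# diff /etc/fstab /tmp/fstab.new    # review, then copy it over and run
#                                   # systemctl daemon-reload
```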

I sent a mail to the systemd developer mailing list, since, in case we want to get rid of x-systemd.automount, we might want to re-add nofail. Even when we do not re-add it, a failing remote-fs.target (if the Samba/NFS server is offline or the network is not fully up yet) causes the potential shutdown issue to happen the exact same way: If one remote fs fails during boot then the other remote fs will be stopped in random order during shutdown · Issue #17221 · systemd/systemd · GitHub

The hanging boot can happen in any case with x-systemd.automount, and I will check back about this with the systemd developers once the nofail topic has been sorted.