I have searched the existing open and closed issues
I run 3 dietpi-servers as virtual machines on ESXi. After upgrading these three from 9.8 to 9.9, two of them are stuck at the GNU GRUB screen with its minimal bash-like shell.
I have not found any related postings. So I kindly ask for some help to repair these servers.
Some further investigation: After mounting the disk of a broken system to a running system I copied the 4 missing files to /boot.
I’m able to boot from the grub bash by manually setting the paths to vmlinuz and initrd.
DietPi welcomes me with a red “A reboot is required to finalize a recent kernel update”. Well, a reboot will bring up the grub bash once again. So I booted manually once again and had a look at dietpi-config. It’s kind of surprising that there is no network connection, because no network adapters are found.
However: lspci shows the VMware VMXNET3 Ethernet Controller.
I have a few more dietpi-systems at remote locations running wireguard. I really do not want this to happen there, because this will break some much needed connections.
As I did no more than to start dietpi-update: what could have caused these dead systems and how can I avoid it?
It’s probably not due to the DietPi scripts themselves. I suspect that there were some apt packages that were updated together. You can check that beforehand to see available Debian package updates. Maybe there is one for grub.
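As a minimal sketch of such a check, the pending upgrades can be listed and filtered for boot-critical packages. The helper name `boot_critical_upgrades` and the choice of prefixes (`grub`, `linux-image`) are assumptions for illustration, not part of any DietPi script.

```shell
#!/bin/sh
# Hypothetical helper: reads `apt list --upgradable` output on stdin and prints the
# names of pending GRUB and kernel image upgrades. Package lines are expected to
# start with "name/suite"; lines without a slash are ignored.
boot_critical_upgrades() {
    awk -F/ 'NF > 1 && $1 ~ /^(grub|linux-image)/ { print $1 }'
}

# Usage (run `apt update` as root first to refresh the package lists):
#   apt list --upgradable 2>/dev/null | boot_critical_upgrades
```

If this prints anything, a reboot test after `apt upgrade` and before `dietpi-update` is probably worthwhile.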
I always run apt update; apt upgrade before running dietpi-update. Did that last night as well. All systems are running grub version 2.06-13+deb12u1
But to be honest (well, there is a slight chance of being misunderstood, but I will write it down nevertheless) I would expect the dietpi-scripts to check on apt or whatsoever updates before bricking a working system by executing dietpi-update.
btw: I was able to copy /lib/modules from a running system to the mounted disk of a dead system. Starting the system from grub bash and running update-grub afterwards makes the system fully bootable, but this will not be an option on remote systems, where I have no easy access to the physical or virtual disks.
So I would appreciate an explanation for this behaviour to avoid situations like this.
The kernel you copied to the system does not match the one installed on the system. Can you show the full content of /boot and /lib/modules on the broken systems?
… ah, as you copied the modules manually as well, please also show the /var/lib/dpkg/info/linux-image-* files, to see without doubt which kernel package was really installed on which system.
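The requested information could be gathered roughly like this. The listing commands only read from the paths named above; the `kernel_has_modules` helper is a hypothetical addition that checks whether a module tree exists for a given kernel release.

```shell
#!/bin/sh
# Show kernel images, module trees and dpkg metadata of kernel packages.
# "|| true" keeps the script going if a path or file does not exist.
ls -l /boot /lib/modules 2>&1 || true
ls -l /var/lib/dpkg/info/linux-image-* 2>&1 || true

# Hypothetical helper: succeeds if a module tree exists for the given kernel release.
# $1 = kernel release (e.g. the output of `uname -r`)
# $2 = optional root directory, e.g. the mount point of a broken system's disk
kernel_has_modules() {
    [ -d "${2:-}/lib/modules/$1" ]
}

# Usage:
#   kernel_has_modules "$(uname -r)" && echo "modules match the running kernel"
```

A kernel in /boot without a matching directory below /lib/modules would explain missing drivers, such as the network adapter not being found.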
1. running apt update to check for available Debian packages
2. updating these packages by running apt upgrade
3. only afterwards, the DietPi scripts are updated.
Basically these are 2 different steps done by our update script. Theoretically steps 1 and 2 could be carried out manually beforehand. Including a reboot. This will ensure all is working before updating our scripts.
Okay, makes sense then. There is nothing in dietpi-update which would remove a kernel package on x86 systems, outside of what APT itself might do, for whatever reason. So I guess the APT upgrades you did before the DietPi update resulted in a purged kernel already. Or it was an auto-removal, if the packages for whatever reason were marked as “auto” installed.
But there are logs from the DietPi update:
cat /var/tmp/dietpi/logs/dietpi-update.log
I guess you have no logs from the APT upgrades you did before the DietPi update?
Can you show the APT repositories you use?
for i in /etc/apt/sources.list{,.d/*.list}; do echo "$i:"; cat "$i"; done
I had a look at .bash_history to double check on the very last commands before dietpi-update
exit
apt update; apt list --upgradeable
apt upgrade
reboot
dietpi-update
The system came from a clean reboot, which actually means there has been a working kernel after the last apt upgrade and before running dietpi-update.
From dietpi-update.log:
[ INFO ] DietPi-Patch | Patching to DietPi v9.9...
[ OK ] DietPi-Patch | Patched to DietPi v9.9
[ INFO ] DietPi-Update | APT autopurge, please wait...
Reading package lists...
Building dependency tree...
Reading state information...
The following packages will be REMOVED:
linux-base* linux-image-6.1.0-27-amd64*
0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
After this operation, 409 MB disk space will be freed.
(Reading database ... 18411 files and directories currently installed.)
Removing linux-image-6.1.0-27-amd64 (6.1.115-1) ...
W: Removing the running kernel
/etc/kernel/postrm.d/zz-update-grub:
Generating grub configuration file ...
Warning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_OS_PROBER documentation entry.
done
Removing linux-base (4.9) ...
(Reading database ... 13466 files and directories currently installed.)
Purging configuration files for linux-base (4.9) ...
Purging configuration files for linux-image-6.1.0-27-amd64 (6.1.115-1) ...
[ OK ] DietPi-Update | APT autopurge
[ OK ] DietPi-Update | systemctl daemon-reload
[ OK ] DietPi-Update | Incremental patching to v9.9.0 completed
Whatever may have caused the kernel packages to be marked, I have no idea why dietpi-update runs an autopurge without any notice and without checking whether there is a reasonable chance the system will come up after dietpi-update.
btw why does dietpi-backup need to run apt? I mean: it’s a backup of the current state; why does it need to download anything from the repos before saving local files to a (mostly) local backup location? I’m sure there has to be a reason, but I’m not getting it, because this will stop on any system with currently unavailable internet access.
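One way to see in advance what an autopurge would remove is an APT simulation, which changes nothing on the system. The sketch below assumes that the simulation prints one line per affected package, starting with `Remv` or `Purg`; the helper name `kernels_to_be_removed` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads the output of an APT simulation on stdin and prints
# every kernel image package that would be removed ("Remv") or purged ("Purg").
kernels_to_be_removed() {
    awk '($1 == "Remv" || $1 == "Purg") && $2 ~ /^linux-image-/ { print $2 }'
}

# Usage (simulation only, no package is touched):
#   apt-get -s autoremove --purge | kernels_to_be_removed
```

Any output here means that a kernel package is marked as automatically installed and no longer required by anything, so it should be checked before running dietpi-update.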
Repositories
/etc/apt/sources.list:
deb https://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware
deb https://deb.debian.org/debian/ bookworm-updates main contrib non-free non-free-firmware
deb https://deb.debian.org/debian-security/ bookworm-security main contrib non-free non-free-firmware
deb https://deb.debian.org/debian/ bookworm-backports main contrib non-free non-free-firmware
/etc/apt/sources.list.d/dietpi.list:
deb https://dietpi.com/apt bookworm main
It is possible to check if a package is installed or not without internet access. If something is missing that is mandatory for the backup you want, the script could stop and warn that it couldn’t access a repository for installation. But I don’t see any good reason why a backup script should stop if the stuff you need is already installed and there’s just no internet access available because of a broken line or whatsoever.
So it looks like you or something purged linux-image-amd64, as otherwise linux-image-6.1.0-27-amd64 would not have been autopurged, or a newer one would have remained installed, such as linux-image-6.1.0-29-amd64, which has become available in the meantime.
On your remaining systems, please check and assure that linux-image-amd64 is and remains installed, when you do APT upgrades, and before you run DietPi update. If you did not uninstall it yourself, you probably should search for whichever installer or other script which might have done this. No DietPi script will ever actively purge linux-image-amd64.
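A minimal sketch of such a check, using only the local dpkg database so that it also works without Internet access. The helper names `status_is_installed` and `pkg_installed` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a dpkg status string means "fully installed".
status_is_installed() {
    [ "$1" = "install ok installed" ]
}

# Hypothetical helper: succeeds if the given package is fully installed.
# dpkg-query only reads the local package database, no network access is needed.
pkg_installed() {
    status_is_installed "$(dpkg-query -W -f='${Status}' "$1" 2>/dev/null)"
}

# Usage:
#   pkg_installed linux-image-amd64 || echo "WARNING: linux-image-amd64 is not installed"
```

Running this before and after each APT upgrade on the remaining systems would show at which point the meta package disappears, if it does.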
It only needs to do so when rsync is currently not installed. If it is already installed, it does not run APT and does not need any Internet connection.
It does exactly that. And it should have attempted to install rsync if missing, failing on the APT update, due to missing network connection.
This would not cause any problems as long as no one starts apt autopurge. To be honest: I have never read that dietpi-update will do so. I do not even know why it should do so. But I understand: before using dietpi-update one has to analyze dietpi-update in depth, because there is a slight chance that it might run apt autopurge.
That does make sense, if the selected backup job will need rsync. But as far as I know, dietpi-backup offers backups that do not require rsync to be installed. These backup jobs will not start without a working internet connection, just because dietpi-backup likes to use APT for no reason at all. I don’t understand why dietpi-backup would want to use APT to install rsync, when it is not even required for the selected backup job.
But I think we can shorten these two topics. DietPi did it right. User did it wrong.
This function has been part of dietpi-update for more than 4 years. Before that, it was handled within a different script and was just moved from the patch file to the main script
to remove possibly obsolete DEB packages and to free up disk space. Especially on old systems where /boot is a quite small partition, it’s a necessary step.
It’s part of the main function and should be executed each time an update is done, not only in certain situations
all backups should use rsync.
Maybe I have overlooked this, but which dietpi-backup job does not require rsync? Can you give an example?
It has always done that since DietPi exists, same as dietpi-software and probably some other scripts. We see it as a general cleanup step in case any manually installed packages were removed. The question is why one would mark the Linux package as “auto” installed, as if it had only been pulled in as a dependency of another package. You are literally the first I know of who ever ran into this issue. At least the mystery is solved then: keep all APT packages that you require marked as “manual” installed. “auto” is only for dependencies, which are not required for anything other than “manual” marked packages. When APT is used to install packages, those listed are always marked “manual”; only pulled-in dependencies which were not installed before are marked “auto”. There is no need to ever use apt-mark to change these states, especially not for a kernel package, unless you know exactly what you are doing.
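The marking can be inspected with apt-mark. The following is a sketch only; the helper name `auto_marked_kernels` is hypothetical, and it simply filters a list of package names for kernel image packages.

```shell
#!/bin/sh
# Hypothetical helper: reads package names on stdin (one per line, as printed by
# `apt-mark showauto`) and prints only kernel image packages.
# "|| true" avoids a non-zero exit status when nothing matches.
auto_marked_kernels() {
    grep -E '^linux-image-' || true
}

# Usage:
#   apt-mark showauto | auto_marked_kernels
#
# If linux-image-amd64 shows up in that list, it can be marked as manually
# installed again (as root):
#   apt-mark manual linux-image-amd64
```

Versioned packages such as linux-image-6.1.0-27-amd64 are normally expected to be marked “auto”, since they are dependencies of the linux-image-amd64 meta package; it is the meta package itself that should be “manual”.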
And yes, dietpi-backup uses rsync for all backup and restore actions, and it always did.