Diagnosing an unresponsive server

I run DietPi on a small computer with an N100 processor and an internal SSD, headless, as a server.

Everything was working fine; I have been updating to the latest Debian and DietPi versions for months (if not years).

Suddenly, 2 days ago, the server became unresponsive. I couldn’t SSH into it anymore. I connected an HDMI cable and the screen was black. So I unplugged the power and rebooted it. Everything worked again.

Then last night, the same thing happened again. I suspect a hardware problem (SSD?), but I have no idea how to find the logs after a reboot. Is it even possible to see what happened before the reboot? dmesg shows no problems after the reboot (it only shows messages from the current boot, of course). I have backups, but if the server was compromised, I don’t want to blindly restore to another SSD.

Thanks for any pointers.

Logs are usually gone after a reboot because they are not persistent.

To enable persistent system logs:

dietpi-software uninstall 103 # uninstalls DietPi-RAMlog
mkdir /var/log/journal # makes systemd-journald write its logs to disk
reboot # required to finalise the RAMlog uninstall
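
After the reboot, it may be worth confirming that journald is really writing to disk. The following is only a minimal sketch, assuming the default /var/log/journal location; journal_is_persistent is a hypothetical helper name, not something DietPi or systemd provides:

```shell
# Hypothetical helper: succeeds if the journal directory exists and is not empty.
# journald normally creates a per-machine subdirectory in there once
# persistent storage is active.
journal_is_persistent() {
    dir="${1:-/var/log/journal}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

Usage would be something like `journal_is_persistent && echo "persistent journal active"`. `journalctl --disk-usage` additionally reports how much space the journal files currently take.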

Then you can check system logs via:

journalctl

This will then also show logs from previous boot sessions. To limit the size, you can additionally apply, for example, the following:

mkdir -p /etc/systemd/journald.conf.d
cat << '_EOF_' > /etc/systemd/journald.conf.d/99-custom.conf
[Journal]
SystemMaxFiles=2
MaxFileSec=7day
_EOF_

This will limit logs to 14 days split across two journal files, so that with rotation you will always have between 7 and 14 days of logs available.
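
Once persistent logging is active and the server hangs again, the interesting part is what the previous boot logged right before it froze. A rough sketch of how that could be pulled out, using standard journalctl options (`-b -1` selects the previous boot); the function names are hypothetical and only there for illustration:

```shell
# Hypothetical helpers around journalctl; the names are illustrative only.

# Print the last N lines (default 200) logged during the previous boot.
prev_boot_tail() {
    journalctl -b -1 -n "${1:-200}" --no-pager
}

# Print only messages of priority "err" or worse from the previous boot.
prev_boot_errors() {
    journalctl -b -1 -p err --no-pager
}

# Print the kernel messages of the previous boot (the dmesg equivalent).
prev_boot_kernel() {
    journalctl -k -b -1 --no-pager
}
```

For example, `prev_boot_tail 100` shows the last 100 lines before the hang, and `journalctl --list-boots` lists all boot sessions that are stored. After a hard freeze the very last messages may not have reached the disk, so missing lines at the end do not necessarily mean nothing happened.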
