Pi Hole DNS service dying

Bug report: bfa7faf1-0e3d-404b-b1dc-fa8da172aa5b

The Pi Hole DNS server seems to periodically die overnight and I can’t seem to figure out why. It seems difficult to troubleshoot because there aren’t many logs in /var/logs and they seem to rotate frequently so I don’t think the necessary log entries would even be there in the morning when I notice.

root@DietPi:~# systemctl status pihole-FTL
● pihole-FTL.service - LSB: pihole-FTL daemon
   Loaded: loaded (/etc/init.d/pihole-FTL; generated; vendor preset: enabled)
   Active: active (exited) since Fri 2019-04-12 00:17:03 PDT; 10h ago
     Docs: man:systemd-sysv-generator(8)
  Process: 7179 ExecStop=/etc/init.d/pihole-FTL stop (code=exited, status=0/SUCCESS)
  Process: 7216 ExecStart=/etc/init.d/pihole-FTL start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/pihole-FTL.service

Apr 12 00:17:02 DietPi systemd[1]: Starting LSB: pihole-FTL daemon...
Apr 12 00:17:02 DietPi pihole-FTL[7216]: Not running
Apr 12 00:17:02 DietPi su[7249]: Successful su for pihole by root
Apr 12 00:17:02 DietPi su[7249]: + ??? root:pihole
Apr 12 00:17:02 DietPi su[7249]: pam_unix(su:session): session opened for user pihole by (uid=0)
Apr 12 00:17:03 DietPi pihole-FTL[7216]: dnsmasq: failed to create listening socket for port 53: Address already in use
Apr 12 00:17:03 DietPi su[7249]: pam_unix(su:session): session closed for user pihole
Apr 12 00:17:03 DietPi systemd[1]: Started LSB: pihole-FTL daemon.

Running systemctl restart pihole-FTL fixes the issue, but I’d like to prevent it from dying in the first place.

I suppose DietPi is not yet guilty :wink: Better to ask @Pi-hole forum.
Note this:

pihole-FTL[7216]: dnsmasq: failed to create listening socket for port 53: Address already in use

See who is holding the port.

Hmm but why is the service even trying to restart in the middle of the night?

I spent a while digging around but can’t for the life of me figure out why the pihole-FTL service is restarting daily at 00:17 daily. Can anyone elucidate?

EDIT: Maybe this?

Hi.

Your edit points to the right direction. You obviously have set (or use the default) logging scheme, that deletes the logfiles hourly. In case of pihole, it removes all entries from dates before the systems current date and then restarts the pihole service. I dont understand the corresponding comments, but obviously the Pihole logs contain statistics data, and a restart avoids possible problems that might occur if they are simply removed.

So, 00:17 is the first time the hourly cronjobs are run on a new day, and all previous days’ log entries are removed, forcing the restart of the Pihole services. I see the same behaviour on my system. As to why it has problems starting, with port 53 already in use, is a different question, can’t help you with that…

Btw: To see older logs, you can either change the Log System setting in dietpi-software, or run

sudo journalctl

to view system logs.

swrobel
Indeed that must be due to DietPi-RAMlog:

  • We need to clear logs regularly, otherwise tmpfs could fill until full.
  • DietPi-Logclear does this every hour via cron job.
  • It clears Pi-hole logs that are not from today. Since today changes after midnight, the first hourly cron job on 0:17 clears the Pi-hole log.
  • Since Pi-hole does not like it when the log file is touched while process is running, we restart it.

So now the question is why your pihole-FTL obviously fails to restart.
Does the logs shortly after 0:17 give some hint: /var/log/pihole.log

Do you have dnsmasq installed? In case it should be purged. Probably a relic from pre-FTLDNS times: G_AGP dnsmasq

This is on a fresh install of DietPi, so I’m pretty sure there’s no reason dnsmasq would’ve been installed:

root@DietPi:~# G_AGP dnsmasq
[  OK  ] Root access verified.
[ INFO ] Not installed, ignoring: dnsmasq
[ INFO ] None of the requested packages are currently installed. Aborting...
[  OK  ] G_AGP dnsmasq

swrobel
Could you please try to manually run the service restarts we do when clearing the log:

systemctl stop pihole-FTL
systemctl stop resolvconf
sleep 1
systemctl start resolvconf
systemctl start pihole-FTL

Also I am not sure if a simple systemctl restart pihole-FTL wouldn’t be sufficient now. resolvconf handling AFAIK is a relic from Pi-hole v3.X times, when pihole-FTL did not have dnsmasq included.

MichaIng that seems to have worked fine. The pi hole service is running & active.

swrobel
Okay so far so good. Since this is exactly what we do after clearing the logs, I am still wondering why it fails there.

Hmm I am thinking how to debug best. You could during a good time of the day remove all lines not from the last day manually:
nano /var/log/pihole.log

systemctl stop pihole-FTL
systemctl stop resolvconf
sleep 1
systemctl start resolvconf
systemctl start pihole-FTL

Then the next day the same time ± one hour, dietpi-logclear should do the same again, so might have a change to see it failing live. However still not sure how to debug then, since the logs are there and not helpful anyway :thinking:.

We could try to replace the lines:

systemctl stop pihole-FTL
systemctl stop resolvconf
sleep 1
systemctl start resolvconf
systemctl start pihole-FTL

with

systemctl restart pihole-FTL

Not sure if this works better, e.g. the resolvconf restart not required. I will also run some tests later and try to replicate the issue.