Docker swarm inaccessible

I have a Docker swarm set up on a set of Raspberry Pi 3/4 and Orange Pi 5 devices running DietPi. This has been functional for months. Today, I updated each node, one at a time, to the latest version of DietPi. None of the ports exposed by the Docker containers are accessible now. This includes Portainer, Navidrome, Audiobookshelf, and several others. Opening a service in a web browser times out, and curl returns no response.

Judging by “docker node ls” and “docker ps”, everything appears to be running just fine. I have rebooted nodes and successfully demoted and promoted manager nodes, but connectivity has not changed. I’m assuming something in the network layer has broken, but I’m unsure how to fix it.

I have a separate standalone (non-swarm) Orange Pi 5 that was updated at the same time. It seems to be functioning properly.

Creating a bug report/issue

Required Information

  • DietPi version | cat /boot/dietpi/.version
  • Distro version | echo $G_DISTRO_NAME $G_RASPBIAN
  • Kernel version | uname -a
    Linux opi1 5.10.160-legacy-rk35xx #1 SMP Mon Aug 28 01:21:24 UTC 2023 aarch64 GNU/Linux
  • Architecture | dpkg --print-architecture
  • SBC model | echo $G_HW_MODEL_NAME or (EG: RPi3)
    Orange Pi 5 (aarch64)
  • Power supply used | (EG: 5V 1A RAVpower)
  • SD card used | (EG: SanDisk ultra)
    Sandisk Ultra

Additional Information (if applicable)

  • Software title | (EG: Nextcloud)
  • Was the software title installed freshly or updated/migrated?
    Docker was previously installed in swarm mode. The issues appeared after the latest DietPi update.
  • Can this issue be replicated on a fresh installation of DietPi?
    ← If you sent a “dietpi-bugreport”, please paste the ID here →
  • Bug report ID | echo $G_HW_UUID

Steps to reproduce

  1. Have Docker set up in swarm mode on multiple nodes
  2. Upgrade to the latest DietPi version
  3. Exposed ports for Docker containers are no longer accessible

Expected behaviour

  • Previously installed docker containers should still be accessible via their exposed ports

Actual behaviour

  • Timeouts in the web browser when attempting to access exposed ports

Extra details

  • Only appears to occur in swarm. Standalone docker installation is not having an issue.

I’ve managed to get most of the services back. To do this, I had to ssh into each device and manually restart Docker.
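For anyone repeating this on several nodes, the restart can be sketched as a small loop. This is only a sketch under assumptions: the hostnames other than opi1 are hypothetical, key-based ssh login and sudo are assumed to be available on each node, and the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Hypothetical node hostnames (only opi1 appears in this report); replace with your own.
NODES="${NODES:-rpi3 rpi4 opi1}"
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

restart_docker_on_nodes() {
    for node in $NODES; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ssh $node sudo systemctl restart docker"
        else
            # Assumes key-based ssh login and sudo on each node.
            ssh "$node" sudo systemctl restart docker \
                || echo "restart failed on $node" >&2
        fi
    done
}

restart_docker_on_nodes
```

Restarting one node at a time, and waiting for it to show as Ready in “docker node ls” before moving on, should keep the swarm up during the process.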

The one service I cannot seem to get running again is Portainer. I’ve gone so far as entirely removing Portainer and re-installing. However, the net result is always 0/1 replicas. It refuses to start.

Are you using Portainer or the Portainer agent?

Both. Portainer was on the swarm with agents on each node. Portainer would not start, though the agents appeared to be fine.

I fixed my Portainer. I did a complete wipe of Portainer and all of its settings. After that, Portainer replicated properly.

Still not sure exactly why this happened. I was using the services hosted under Docker just fine. Updated each node one at a time (with enough rest between to ensure the swarm stayed up) to the latest DietPi and while their status appeared fine in Docker, I couldn’t connect to any of the services. Odd issue but it is all cleared up now.

Did you stop and then rm the old container, then restart with the new :latest image?

I know that sometimes, if one gets updated by Watchtower, I have to go through each one, blow away the old portainer.agent, then rebuild it, especially under the main Portainer server. I think it might be a cert change or something. Blowing them away and rebuilding them usually does the trick.

Since there really isn’t much persistent config for Portainer or the agent, they start right up and just do what they are supposed to do.
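The stop, rm, and pull steps could look roughly like the sketch below. The container name portainer_agent is an assumption (use whatever “docker ps” shows on your node), and the script only prints the commands unless DRY_RUN is set to 0. Afterwards, start the container again with the same command or stack file you used originally.

```shell
#!/bin/sh
# Hypothetical container name; check "docker ps -a" for the real one.
CONTAINER="${CONTAINER:-portainer_agent}"
IMAGE="${IMAGE:-portainer/agent:latest}"
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

refresh_container() {
    run docker stop "$CONTAINER"   # stop the old container
    run docker rm "$CONTAINER"     # remove it so the name is free again
    run docker pull "$IMAGE"       # fetch the current :latest image
}

refresh_container
```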

There should have been something in the Docker logs explaining why the container was not starting:

journalctl -u docker.service
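For a swarm service stuck at 0/1 replicas, the task state and service logs are usually worth checking alongside the daemon journal. A rough sketch follows; the service name portainer_portainer is an assumption (use the name shown by “docker service ls”), and the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Hypothetical service name; check "docker service ls" for the real one.
SERVICE="${SERVICE:-portainer_portainer}"
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

diagnose_service() {
    # Task history, including the untruncated error for failed tasks
    run docker service ps --no-trunc "$SERVICE"
    # Most recent output from the service's containers, if any started
    run docker service logs --tail 50 "$SERVICE"
    # Docker daemon messages from today only
    run journalctl -u docker.service --since today --no-pager
}

diagnose_service
```

The first two commands need to be run on a manager node, while the journalctl command is worth running on the node where the task was scheduled.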