Debug Docker Swarm Services

The current stable version of Docker Swarm is missing an important feature: the same control over logs that is available without orchestration.

It can be really painful to debug a docker service create command that ends in a "task: non-zero exit (1)" kind of error.

But the Docker team is working on a solution. They started developing it when Docker Swarm Mode appeared, and it is already available as an experimental feature. In this article I will explain how to enable experimental features on Docker Swarm and how to use the new docker service logs command. Without experimental mode, the command is rejected:

    $ docker service logs px8fv06fft8s
    only supported with experimental daemon

How to enable Docker Daemon experimental features

You can follow the official documentation, but I will briefly explain the necessary steps.

  1. Create the file /etc/docker/daemon.json with the following content:

     {
       "experimental": true
     }

  2. Restart the Docker daemon:

     systemctl restart docker

  3. Test if the experimental features are enabled (the command prints true when they are):

     docker version -f '{{.Server.Experimental}}'
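The steps above can be sketched as a small shell helper. This is only a sketch under assumptions: the function names write_experimental_config and experimental_enabled are made up for illustration, the host uses systemd, and no daemon.json exists yet (an existing file should be edited by hand so its other settings are kept).

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not part of Docker.

# Write a minimal daemon.json that enables experimental features.
# Refuses to overwrite an existing file so other settings are not lost.
write_experimental_config() {
    target="${1:-/etc/docker/daemon.json}"
    if [ -e "$target" ]; then
        echo "refusing to overwrite $target; add the key by hand" >&2
        return 1
    fi
    printf '{\n  "experimental": true\n}\n' > "$target"
}

# Succeeds only when the daemon reports experimental mode.
experimental_enabled() {
    [ "$(docker version -f '{{.Server.Experimental}}')" = "true" ]
}
```

A possible sequence, run as root, would be: write_experimental_config && systemctl restart docker && experimental_enabled && echo "experimental mode is on".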


docker service logs is not the only experimental feature being developed for Docker Swarm. There are other very interesting ones, such as Prometheus metrics for basic container, image, and daemon operations.

Using Docker Service Logs

This new feature is very straightforward to use, especially if you are familiar with debugging problems using logs on non-orchestrated Docker installations (without the swarm).

After any docker service create ... command, an ID for the new service is shown:

    $ docker service create --name jenkins \
        -p 8082:8080 -p 50000:50000 \
        -e JENKINS_OPTS="--prefix=/jenkins" \
        --mount "type=bind,source=/mnt/swarm-nfs/jenkins-fg,target=/var/jenkins_home" \
        --reserve-memory 300m \
        jenkins

In this example we created a service to run Jenkins, using NFS for data persistence. After the command is executed we obtain the ID of the service, which in this case is v6oq1shzs51fw6q8d2bstkd2f.
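If the ID has scrolled out of the terminal, it can be looked up again by service name. A minimal sketch, assuming the service name jenkins from the command above; service_id is a hypothetical helper name, while docker service inspect with --format is the regular CLI call:

```shell
#!/bin/sh
# Sketch: print the full ID of a service given its name.
# service_id is an illustrative helper, not a Docker command.
service_id() {
    docker service inspect --format '{{.ID}}' "$1"
}
```

Calling service_id jenkins on a manager node should print the same ID that docker service create returned.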

To check the status of the service, you can use the docker service ps command:

    $ docker service ps jenkins
    ID            NAME              IMAGE           NODE  DESIRED STATE  CURRENT STATE                  ERROR                      PORTS
    j33856j3oxt1  jenkins-fg.1      jenkins:latest        Ready          Ready 2 seconds ago
    aaig0ummuhnq   \_ jenkins-fg.1  jenkins:latest        Shutdown       Failed less than a second ago  "task: non-zero exit (1)"
    fqsp16ixgft3   \_ jenkins-fg.1  jenkins:latest        Shutdown       Failed 10 seconds ago          "task: non-zero exit (1)"
    kn9c09vy06x1   \_ jenkins-fg.1  jenkins:latest        Shutdown       Failed 11 seconds ago          "task: non-zero exit (1)"
    t48zk8hmn2ga   \_ jenkins-fg.1  jenkins:latest        Shutdown       Failed 18 seconds ago          "task: non-zero exit (1)"

Here we can see that the container repeatedly failed to start. We are also told on which member of the swarm the container was trying to start, but we don't get any information about why the error happened.
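When a service has many tasks, the listing can be narrowed to the stopped attempts only. The following is a sketch, assuming a Docker CLI that supports the --no-trunc flag and the desired-state filter of docker service ps; failed_tasks is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: list only the stopped attempts of a service, with
# untruncated columns. failed_tasks is an illustrative helper name.
failed_tasks() {
    docker service ps --no-trunc --filter desired-state=shutdown "$1"
}
```

Running failed_tasks jenkins shows the full ID and ERROR columns, but the error text still only repeats the exit status.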

So to find out, we check the logs of the starting process:

    $ docker service logs j4kawvh4b41h
    | touch: cannot touch ‘/var/jenkins_home/copy_reference_file.log’: Permission denied
    | touch: cannot touch ‘/var/jenkins_home/copy_reference_file.log’: Permission denied
    | touch: cannot touch ‘/var/jenkins_home/copy_reference_file.log’: Permission denied
    | Can not write to /var/jenkins_home/copy_reference_file.log. Wrong volume permissions?
    | Can not write to /var/jenkins_home/copy_reference_file.log. Wrong volume permissions?

The logs clearly show that the problem in this example was wrong permissions on a folder from an NFS filesystem that was mounted during service creation, a problem that we can easily solve using chmod or user mapping on NFS.
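Because a failing service is restarted over and over, the same messages tend to repeat many times. A small sketch for collapsing them into counts, assuming each log line carries a prefix followed by "| " as in the output above; summarize_logs is a hypothetical helper name that only reads text on standard input:

```shell
#!/bin/sh
# Sketch: collapse repeated log lines into counts, most frequent first.
# Reads log text on stdin; summarize_logs is an illustrative name.
summarize_logs() {
    # Drop everything up to and including the first "| " separator,
    # then count identical messages.
    sed 's/^[^|]*| //' | sort | uniq -c | sort -rn
}
```

It could be used as docker service logs jenkins 2>&1 | summarize_logs, which leaves one line per distinct message, each preceded by the number of times it occurred.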