4

I have two servers in a Docker Swarm, but when I try to add a third server I get this error:

Error response from daemon: rpc error: code = 14 desc = grpc: the connection is unavailable

All servers are on the same network.

What could be the problem?

ChipX

9 Answers

5

I'd say it's possibly firewall related. Ensure your ports are configured correctly on the third box. From the Docker docs:

Open protocols and ports between the hosts

The following ports must be available. On some systems, these ports are open by default.

TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
UDP port 4789 for overlay network traffic
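
To check from the third box whether the manager's TCP ports are actually reachable, something like the sketch below may help. It assumes bash (for /dev/tcp) and the coreutils timeout command; check_tcp and report_swarm_tcp are made-up helper names. It only covers the two TCP ports. The UDP ports (7946 and 4789) cannot be probed this way.

```shell
# check_tcp HOST PORT
# Succeeds (exit status 0) if a TCP connection to HOST:PORT can be opened
# within 3 seconds. Relies on bash's /dev/tcp pseudo-device.
check_tcp() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# report_swarm_tcp HOST
# Prints one line per swarm TCP port saying whether it is reachable.
report_swarm_tcp() {
    for port in 2377 7946; do
        if check_tcp "$1" "$port"; then
            echo "$1:$port reachable"
        else
            echo "$1:$port NOT reachable"
        fi
    done
}
```

Run it on the joining node with the manager's address, for example report_swarm_tcp 192.168.0.10 (the address is a placeholder).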

Rawkode
4

From the official Docker Swarm tutorial:

The following ports must be open on your docker hosts.

TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes   
UDP port 4789 for overlay network traffic

To open these ports, run the commands below on all your Docker hosts. See the DigitalOcean article for the complete steps.

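# 2376/tcp is for secure Docker client communication (used by Docker Machine);
# the remaining ports are the swarm ports listed above.
firewall-cmd --add-port=2376/tcp --permanent
firewall-cmd --add-port=2377/tcp --permanent
firewall-cmd --add-port=7946/tcp --permanent
firewall-cmd --add-port=7946/udp --permanent
firewall-cmd --add-port=4789/udp --permanent
# --permanent only changes the saved configuration; reload to apply it.
firewall-cmd --reload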
sanjaykumar81
1

As others have pointed out, closed ports could be one reason. But I've also found a couple more.

Recent versions of Docker are suffering from massive proxy issues:

According to this comment, the fix is "likely" to make it into Docker version 17.11 and it is "considered" to be put in a patch release for 17.09.

All my ports are open and the NO_PROXY hack described in the aforementioned links did not work.

I tried every Docker version from 17.05 through 17.11.0-ce-rc3, build 5b4af4f, with no success, which led me to suspect that the culprit might be a recent upgrade of Vagrant (I am using 2.0.1) and/or VirtualBox (using 5.1.30). Upgrading either of these two usually leads to all kinds of random problems. But instead of downgrading them, I tried upgrading the Vagrant boxes I run.

In my two-machine setup, I switched the first node's box to fso/artful64-desktop and the second node's box to fso/artful64 (both version 2017-11-01). To my surprise, this made Docker Swarm work on versions 17.10.0-ce and 17.11.0-ce-rc3, build 5b4af4f. Please note that private networking is broken on Vagrant 2.0.1 if you want to use Ubuntu 17.10 boxes lol (it can be fixed manually).

Martin Andersson
1

The error message we were facing was not exactly the same but quite similar (gRPC status code 14 means Unavailable):

Error response from daemon: rpc error: code = Unavailable desc = grpc: the connection is unavailable

In our case we had added proxy settings to the Docker daemon in order to reach Docker Hub images from behind our corporate proxy. So when we tried to docker swarm join a worker to the manager, the request went to the proxy instead.

Solution: Add the swarm manager to the docker daemon NO_PROXY environment variable and you are good to go. This answer tells you how.
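
For a daemon managed by systemd, one common way to do this is a drop-in file for the Docker service. The sketch below only prints the file contents; docker_proxy_dropin is a made-up helper name, and the proxy URL and manager address in the usage note are placeholders. The drop-in path follows the Docker documentation for systemd, but it is worth checking against your own distribution.

```shell
# docker_proxy_dropin PROXY_URL NO_PROXY_LIST
# Prints a systemd drop-in that sets the proxy variables for the Docker
# daemon. NO_PROXY_LIST is a comma-separated list of hosts that must be
# reached directly, which should include the swarm manager's address.
docker_proxy_dropin() {
    printf '[Service]\n'
    printf 'Environment="HTTP_PROXY=%s"\n' "$1"
    printf 'Environment="HTTPS_PROXY=%s"\n' "$1"
    printf 'Environment="NO_PROXY=%s"\n' "$2"
}
```

Possible usage on the worker (all values are placeholders):

sudo mkdir -p /etc/systemd/system/docker.service.d
docker_proxy_dropin http://proxy.example.com:3128 localhost,127.0.0.1,192.168.0.10 | sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf
sudo systemctl daemon-reload
sudo systemctl restart docker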

1

More information is available on the Docker Forum:

https://forums.docker.com/t/error-response-from-daemon-rpc-error-code-unavailable-desc-grpc-the-connection-is-unavailable/39066

As other people mentioned, opening the additional ports in firewalld resolves the issue:

sudo firewall-cmd --add-port=2376/tcp --permanent
sudo firewall-cmd --add-port=2377/tcp --permanent
sudo firewall-cmd --add-port=7946/tcp --permanent
sudo firewall-cmd --add-port=7946/udp --permanent
sudo firewall-cmd --add-port=4789/udp --permanent
# Apply the permanent rules to the running firewall.
sudo firewall-cmd --reload
Prashant Lakhera
1

Remember to restart the firewall after opening the ports:

sudo firewall-cmd --add-port=2376/tcp --permanent 
sudo firewall-cmd --add-port=2377/tcp --permanent 
sudo firewall-cmd --add-port=7946/tcp --permanent 
sudo firewall-cmd --add-port=7946/udp --permanent 
sudo firewall-cmd --add-port=4789/udp --permanent

sudo systemctl restart firewalld
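
After the restart, it may be worth confirming that the rules are really active. The sketch below compares a list of open ports against the swarm ports from the Docker docs. missing_swarm_ports is a made-up helper, and it assumes the list is space-separated in the port/protocol form that firewall-cmd --list-ports prints.

```shell
# Ports the Docker docs list for swarm mode.
SWARM_PORTS="2377/tcp 7946/tcp 7946/udp 4789/udp"

# missing_swarm_ports "OPEN_PORTS"
# Prints every required swarm port that does not appear in the
# space-separated list OPEN_PORTS, one per line. Prints nothing if all
# required ports are present.
missing_swarm_ports() {
    open_ports=" $1 "
    for port in $SWARM_PORTS; do
        case "$open_ports" in
            *" $port "*) ;;          # port is open, nothing to report
            *) echo "$port" ;;       # port is missing
        esac
    done
}
```

Possible usage: missing_swarm_ports "$(sudo firewall-cmd --list-ports)". Empty output means all four ports are open in the default zone.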
Hailin Tan
0

An easier approach from the official docs:

  1. Re-init the swarm manager:

    • take down the swarm with docker swarm leave --force
    • re-init with docker swarm init --advertise-addr [ip of the machine, check it with 'docker-machine ls']:2377 (2377 is the port for swarm joins)
  2. Then add the machine to the swarm with docker-machine ssh myvm2 "docker swarm join --token <token> <ip>:<port>"
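
The steps above can be sketched as a pair of small helpers. The function names and the address 192.168.99.100 are placeholders for illustration. Keep in mind that docker swarm leave --force on the manager discards the existing swarm, so any node that already joined would have to join again.

```shell
# Placeholder; replace with the manager IP shown by `docker-machine ls`.
MANAGER_IP="192.168.99.100"

# reinit_manager
# Run on the manager: leaves the current swarm and creates a new one that
# advertises itself on port 2377.
reinit_manager() {
    docker swarm leave --force
    docker swarm init --advertise-addr "${MANAGER_IP}:2377"
}

# join_command TOKEN
# Prints the command a worker should run to join the new swarm.
join_command() {
    echo "docker swarm join --token $1 ${MANAGER_IP}:2377"
}
```

Possible usage on the manager: run reinit_manager, then join_command "$(docker swarm join-token -q worker)" to print the join command, and execute that command on the new machine (for example through docker-machine ssh myvm2).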

Robin Chen
0

Temporarily solved by flushing iptables, but that was a bad idea! After that, cloning images didn't work because Docker couldn't find the appropriate iptables chain "docker".

It is indeed a firewall issue, more precisely a firewalld one (CentOS 7).
I solved it by allowing the appropriate ports through firewalld, as described in @sanjaykumar81's answer.

AJN
0

Ensure that firewalld, on machines running systemd, allows the ports mentioned in the Docker docs:

The following ports must be available. On some systems, these ports are open by default.

TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
UDP port 4789 for overlay network traffic

Ensure that the appropriate TCP / UDP ports are enabled

At certain times the following error can also appear, because the clocks of the leader and the worker node are not in sync:

error: desc = "transport: x509: certificate has expired or is not yet valid"

Keeping the clocks in sync with chronyd or ntpd resolves this.
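
A rough way to see whether the clocks have drifted is to compare epoch seconds from both nodes. This is only a sketch: clock_skew is a made-up helper, and the host name worker1 in the usage note is a placeholder. SSH latency adds a little noise, so differences of a second or two are not meaningful.

```shell
# clock_skew SECONDS_A SECONDS_B
# Prints the absolute difference between two Unix timestamps, in seconds.
clock_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}
```

Possible usage from the leader: clock_skew "$(date +%s)" "$(ssh worker1 date +%s)". A large value suggests that time synchronization is not working on one of the nodes.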


deadpooL