
This question has been asked many times before on all kinds of forums, but unfortunately none of the answers have helped me so far.

I will get right to it.

 OS: RHEL 7.7 (Maipo)
 Docker Version: Engine/Client 18.09.7

My configuration:

Host 1: IP 169.192.215.74
Host 2: IP 10.210.87.216

On Host 1:

# netstat -plntu|grep -E "4789|7946|2377"
tcp6       0      0 :::2377                 :::*                    LISTEN      3576/dockerd        
tcp6       0      0 :::7946                 :::*                    LISTEN      3576/dockerd        
udp        0      0 0.0.0.0:4789            0.0.0.0:*                           -                   
udp6       0      0 :::7946                 :::*                                3576/dockerd

# docker swarm init --advertise-addr=169.192.215.74

# docker network create --attachable --driver overlay syndichain

# docker swarm join-token manager


docker swarm join --token SWMTKN-1-2rxp0qlhl8o5n1hlmf0umcj8jdub19t07ndbu7mlodeb1yi4uf-ceopcr39eaaiz6bfqd4kk6ojs 169.192.215.74:2377

Ping Host 2:

# ping 10.210.87.16
PING 10.210.87.16 (10.210.87.16) 56(84) bytes of data.
64 bytes from 10.210.87.16: icmp_seq=1 ttl=119 time=0.804 ms
64 bytes from 10.210.87.16: icmp_seq=2 ttl=119 time=1.49 ms
64 bytes from 10.210.87.16: icmp_seq=3 ttl=119 time=0.646 ms
64 bytes from 10.210.87.16: icmp_seq=4 ttl=119 time=0.664 ms
^C
--- 10.210.87.16 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 0.646/0.901/1.493/0.348 ms


# ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:8d:a0:60 brd ff:ff:ff:ff:ff:ff
    inet 169.192.215.74/24 brd 169.192.215.255 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:a9:99:bf:c2 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
76: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ab:17:1e:2f brd ff:ff:ff:ff:ff:ff
    inet 172.19.0.1/16 brd 172.19.255.255 scope global docker_gwbridge
       valid_lft forever preferred_lft forever
468: veth49ae88e@if467: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default 
    link/ether 82:e1:f6:1e:72:ec brd ff:ff:ff:ff:ff:ff link-netnsid 1


Bring up two RHEL 7 containers on Host 1 (image from our local repository):

# docker run --rm -it --network="syndichain" -p 11002:11002 --name rhel.da85 957c8834c8a9 bash

# docker run --rm -it --network="syndichain" -p 11003:11003 --name rhel.da85.2 957c8834c8a9 bash

On Host 2:

# docker swarm join --token SWMTKN-1-2rxp0qlhl8o5n1hlmf0umcj8jdub19t07ndbu7mlodeb1yi4uf-ceopcr39eaaiz6bfqd4kk6ojs 169.192.215.74:2377 --advertise-addr=10.210.87.216

# docker node ls
ID                            HOSTNAME                      STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
yxphc861xra6ajc5llxrpp9ia *   sd-cfd1-0319   Ready               Active              Reachable           18.09.7
2c45co3pj38u2gwqrvuawwi11     sd-d5d8-da85   Ready               Active              Leader              18.09.7

# netstat -plntu|grep -E "4789|7946|2377"
tcp6       0      0 :::2377                 :::*                    LISTEN      8562/dockerd        
tcp6       0      0 :::7946                 :::*                    LISTEN      8562/dockerd        
udp        0      0 0.0.0.0:4789            0.0.0.0:*                           -                   
udp6       0      0 :::7946                 :::*                                8562/dockerd   

Ping Host 1:

# ping 169.192.215.74
PING 169.192.215.74 (169.192.215.74) 56(84) bytes of data.
64 bytes from 169.192.215.74: icmp_seq=1 ttl=55 time=0.796 ms
64 bytes from 169.192.215.74: icmp_seq=2 ttl=55 time=0.530 ms
64 bytes from 169.192.215.74: icmp_seq=3 ttl=55 time=0.513 ms
^C
--- 169.192.215.74 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.513/0.613/0.796/0.129 ms


# ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:98:63:cb brd ff:ff:ff:ff:ff:ff
    inet 10.210.87.216/23 brd 10.210.87.255 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:20:4a:de:3f brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
871: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:f0:d9:46:4c brd ff:ff:ff:ff:ff:ff
    inet 172.23.0.1/16 brd 172.23.255.255 scope global docker_gwbridge
       valid_lft forever preferred_lft forever
885: veth4d635e9@if884: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default 
    link/ether 86:4f:5b:81:89:5c brd ff:ff:ff:ff:ff:ff link-netnsid 1
892: veth2734e0c@if891: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default 
    link/ether b2:6c:1d:94:99:3f brd ff:ff:ff:ff:ff:ff link-netnsid 4


Bring up two RHEL 7 containers on Host 2:

# docker run --rm -it --network="syndichain" --name rhel.0319 -p 11001:11001 957c8834c8a9 bash

# docker run --rm -it --network="syndichain" -p 11002:11002 --name rhel.0319.2 957c8834c8a9 bash

Now, we run some commands to check our overlay network:

On Host 1:

# docker network inspect syndichain
[
    {
        "Name": "syndichain",
        "Id": "8ar2ar0z0brl5lp1lk13xo4ik",
        "Created": "2020-02-28T11:44:01.411177608-05:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "fb911a7538d05f116d7f3ed50192e198d93faadaf1fb98f55fa79eb4cfbea72d": {
                "Name": "rhel.da85",
                "EndpointID": "8caf50c904f660c750bfa3608902d78239211e7076afed68d77a46e3e141386d",
                "MacAddress": "02:42:0a:00:00:02",
                "IPv4Address": "10.0.0.2/24",
                "IPv6Address": ""
            },
            "lb-syndichain": {
                "Name": "syndichain-endpoint",
                "EndpointID": "0a122599aee9d9b84dde6383aeb432580769af886278b46310a1c65bc68a0bc9",
                "MacAddress": "02:42:0a:00:00:03",
                "IPv4Address": "10.0.0.3/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "7f037ddf9a5f",
                "IP": "10.210.87.216"
            },
            {
                "Name": "b1b424f73d55",
                "IP": "169.192.215.74"
            }
        ]
    }
]

On Host 2:

# docker network inspect syndichain
[
    {
        "Name": "syndichain",
        "Id": "8ar2ar0z0brl5lp1lk13xo4ik",
        "Created": "2020-02-28T11:42:08.565999966-05:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "351d29af85e6e8addafcf69b3c464b49f8c9720cbaf8911328ab1aadce13e170": {
                "Name": "rhel.0319.2",
                "EndpointID": "fe0ed40c7a0b8687ca4a34d7b0196e382c32f3a5f82c43233174da349a8a5b34",
                "MacAddress": "02:42:0a:00:00:04",
                "IPv4Address": "10.0.0.4/24",
                "IPv6Address": ""
            },
            "62493b5072d9c2e516a0b8bfb9d0ba6dea9c2cbca2787bf4f8c2bf5ffe021dd0": {
                "Name": "rhel.0319",
                "EndpointID": "9a5e8977ca434854da886c215f14583cc5a7c5bb8811d5408a5cae9c4a325c47",
                "MacAddress": "02:42:0a:00:00:0e",
                "IPv4Address": "10.0.0.14/24",
                "IPv6Address": ""
            },
            "lb-syndichain": {
                "Name": "syndichain-endpoint",
                "EndpointID": "37415feb73f86cc3ceb1ab4d060da076ed4fca1f0b3ebe440993c82c14258a2d",
                "MacAddress": "02:42:0a:00:00:0f",
                "IPv4Address": "10.0.0.15/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "7f037ddf9a5f",
                "IP": "10.210.87.216"
            },
            {
                "Name": "b1b424f73d55",
                "IP": "169.192.215.74"
            }
        ]
    }
]

Now, I will demonstrate the problem:

Ping rhel.da85.2 container from rhel.da85 container (both containers on the same host)

[root@fb911a7538d0 /]# ping rhel.da85.2
PING rhel.da85.2 (10.0.0.5) 56(84) bytes of data.
64 bytes from rhel.da85.2.syndichain (10.0.0.5): icmp_seq=1 ttl=64 time=0.080 ms
64 bytes from rhel.da85.2.syndichain (10.0.0.5): icmp_seq=2 ttl=64 time=0.074 ms
64 bytes from rhel.da85.2.syndichain (10.0.0.5): icmp_seq=3 ttl=64 time=0.048 ms
^C
--- rhel.da85.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.048/0.067/0.080/0.015 ms

Ping rhel.0319 container from rhel.da85 container (containers on different hosts)

[root@fb911a7538d0 /]# ping rhel.0319
PING rhel.0319 (10.0.0.14) 56(84) bytes of data.

It hangs here; nothing happens.


Ping rhel.0319.2 container from rhel.0319 container (both containers on the same host)

[root@62493b5072d9 /]# ping rhel.0319.2
PING rhel.0319.2 (10.0.0.4) 56(84) bytes of data.
64 bytes from rhel.0319.2.syndichain (10.0.0.4): icmp_seq=1 ttl=64 time=0.109 ms
64 bytes from rhel.0319.2.syndichain (10.0.0.4): icmp_seq=2 ttl=64 time=0.061 ms
64 bytes from rhel.0319.2.syndichain (10.0.0.4): icmp_seq=3 ttl=64 time=0.064 ms
64 bytes from rhel.0319.2.syndichain (10.0.0.4): icmp_seq=4 ttl=64 time=0.068 ms
64 bytes from rhel.0319.2.syndichain (10.0.0.4): icmp_seq=5 ttl=64 time=0.064 ms
64 bytes from rhel.0319.2.syndichain (10.0.0.4): icmp_seq=6 ttl=64 time=0.063 ms
^C
--- rhel.0319.2 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5000ms
rtt min/avg/max/mdev = 0.061/0.071/0.109/0.018 ms


Ping rhel.da85 container from rhel.0319 container (containers on different hosts)

[root@62493b5072d9 /]# ping rhel.da85  
PING rhel.da85 (10.0.0.2) 56(84) bytes of data.


It hangs here as well, just as it did when pinging a container on 0319 from da85.

I have even explicitly enabled IPv4 forwarding, as suggested in https://stackoverflow.com/a/41453306/10382340
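For reference, this is the change I applied (the standard sysctl setting; appending to /etc/sysctl.conf is one common way to persist it):

# sysctl -w net.ipv4.ip_forward=1
# echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
# sysctl -p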

However, I have been stuck on this issue for the past week. I would appreciate any help.

Update:

Someone suggested running tcpdump and nsenter. Here are the tcpdump logs captured at the container level (on the same VM where the ping request is generated), and the nsenter logs filtered for ARP. You can see that the ARP request never gets a response in the container-level tcpdump, yet nsenter inside the overlay network's namespace shows ARP request/reply pairs.
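For anyone reproducing this: the namespace passed to nsenter below can be found by listing /var/run/docker/netns/ on the host; to my understanding, the entries prefixed with 1- correspond to overlay networks:

# ls /var/run/docker/netns/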

[root@0ae70bfa66ae /]# tcpdump -v
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
16:26:03.276397 IP (tos 0x0, ttl 64, id 34495, offset 0, flags [DF], proto ICMP (1), length 84)
    rhel.0319.syndichain > rhel.da85.syndichain: ICMP echo request, id 62, seq 1, length 64
16:26:04.276569 IP (tos 0x0, ttl 64, id 34971, offset 0, flags [DF], proto ICMP (1), length 84)
    rhel.0319.syndichain > rhel.da85.syndichain: ICMP echo request, id 62, seq 2, length 64
16:26:05.276526 IP (tos 0x0, ttl 64, id 35636, offset 0, flags [DF], proto ICMP (1), length 84)
    rhel.0319.syndichain > rhel.da85.syndichain: ICMP echo request, id 62, seq 3, length 64
16:26:06.276545 IP (tos 0x0, ttl 64, id 35935, offset 0, flags [DF], proto ICMP (1), length 84)
    rhel.0319.syndichain > rhel.da85.syndichain: ICMP echo request, id 62, seq 4, length 64
16:26:07.276542 IP (tos 0x0, ttl 64, id 36706, offset 0, flags [DF], proto ICMP (1), length 84)
    rhel.0319.syndichain > rhel.da85.syndichain: ICMP echo request, id 62, seq 5, length 64
16:26:08.276543 IP (tos 0x0, ttl 64, id 37284, offset 0, flags [DF], proto ICMP (1), length 84)
    rhel.0319.syndichain > rhel.da85.syndichain: ICMP echo request, id 62, seq 6, length 64
16:26:08.294475 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has rhel.da85.syndichain tell rhel.0319.syndichain, length 28




# nsenter --net=/var/run/docker/netns/1-khyguu9i6n tcpdump -peni any "arp"

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
11:45:23.302460   P 02:42:0a:00:00:0b ethertype ARP (0x0806), length 44: Request who-has 10.0.0.2 tell 10.0.0.11, length 28
11:45:23.302471 Out 02:42:0a:00:00:0b ethertype ARP (0x0806), length 44: Request who-has 10.0.0.2 tell 10.0.0.11, length 28
11:45:23.302485  In 02:42:0a:00:00:02 ethertype ARP (0x0806), length 44: Reply 10.0.0.2 is-at 02:42:0a:00:00:02, length 28
11:45:23.302488 Out 02:42:0a:00:00:02 ethertype ARP (0x0806), length 44: Reply 10.0.0.2 is-at 02:42:0a:00:00:02, length 28
11:45:44.294458   P 02:42:0a:00:00:0b ethertype ARP (0x0806), length 44: Request who-has 10.0.0.2 tell 10.0.0.11, length 28
11:45:44.294474 Out 02:42:0a:00:00:0b ethertype ARP (0x0806), length 44: Request who-has 10.0.0.2 tell 10.0.0.11, length 28
11:45:44.294477  In 02:42:0a:00:00:02 ethertype ARP (0x0806), length 44: Reply 10.0.0.2 is-at 02:42:0a:00:00:02, length 28
11:45:44.294479 Out 02:42:0a:00:00:02 ethertype ARP (0x0806), length 44: Reply 10.0.0.2 is-at 02:42:0a:00:00:02, length 28
11:46:05.302460   P 02:42:0a:00:00:0b ethertype ARP (0x0806), length 44: Request who-has 10.0.0.2 tell 10.0.0.11, length 28
11:46:05.302480 Out 02:42:0a:00:00:0b ethertype ARP (0x0806), length 44: Request who-has 10.0.0.2 tell 10.0.0.11, length 28
11:46:05.302483  In 02:42:0a:00:00:02 ethertype ARP (0x0806), length 44: Reply 10.0.0.2 is-at 02:42:0a:00:00:02, length 28
11:46:05.302485 Out 02:42:0a:00:00:02 ethertype ARP (0x0806), length 44: Reply 10.0.0.2 is-at 02:42:0a:00:00:02, length 28

Update 2:

From a container on the source host:

[root@2baddaa98452 /]# nc -zvu rhel.da85 4789
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 10.0.0.2:4789.
Ncat: UDP packet sent successfully
Ncat: 1 bytes sent, 0 bytes received in 2.02 seconds.

tcpdump on the source host:

# tcpdump -i ens192 port 4789
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
11:28:11.706587 IP sd-cfd1-0319.59121 > sd-d5d8-da85.4789: VXLAN, flags [I] (0x08), vni 4097
IP 10.0.0.11.44806 > 10.0.0.2.4789:  [|VXLAN]

tcpdump on the target host (no packets are captured):

# tcpdump -i ens192 port 4789
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
  • Is the issue only with ping? Are you able to connect to listening ports (curl or nc)? Is this only and always an issue when connecting to containers running on different hosts? – BMitch Mar 02 '20 at 17:16
  • Hi BMitch, the issue actually arose out of https://stackoverflow.com/questions/60384272/cannot-communicate-from-fabric-peer1-to-orderer-in-a-docker-swarm-network-on-mul and then I thought of doing this very basic test. To answer your question, cannot ping, curl or do anything with containers that are part of the swarm network AND are on other hosts. No issues with communicating with containers that are part of the swarm network AND are on the same host. – Ashish Chandra Mar 02 '20 at 22:29

2 Answers


From your comment, it appears that the overlay networking ports are blocked between nodes. The block could come from iptables or another firewall tool on the host, from a network firewall between the nodes, or from other software such as VM tooling or a cloud router ACL. The ports that need to be open (see the firewalld sketch below for one way to open them on the hosts themselves) are:

  • TCP and UDP port 7946 for communication among nodes
  • UDP port 4789 for overlay network traffic

Ref: https://docs.docker.com/network/overlay/
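For example, if the nodes run firewalld (the RHEL 7 default) and the block is local to the hosts, something along these lines would open them. This is a sketch; zone and persistence details depend on your environment, and TCP 2377 is included for swarm management traffic even though it already shows as reachable in the question:

# firewall-cmd --permanent --add-port=2377/tcp --add-port=7946/tcp --add-port=7946/udp --add-port=4789/udp
# firewall-cmd --reload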

  • Hi BMitch, that is not the issue. I have verified and double-checked that these ports are open. If you have any commands you need me to run to confirm, I can do that (I have already indicated so in the post) using nc or something else. – Ashish Chandra Mar 03 '20 at 15:10
  • @AshishChandra Run tcpdump on each host, you can filter by the ports above. Verify that all traffic sent from one host reaches the other nodes in the cluster. – BMitch Mar 03 '20 at 15:11
  • For what it's worth, I hear from others all the time saying "it's not the network" only to come back and acknowledge that it was the network being blocked. There are lots of things that can do this. Last one that surprised me was VMware NSX. https://stackoverflow.com/questions/60438128/swarm-mode-routing-mesh-not-working-instead-is-working-like-host-mode-by-defaul/60442952#60442952 – BMitch Mar 03 '20 at 15:21
  • Hi @BMitch, I have added an Update2 to my original question. It does appear that the recipient VM does not register any traffic on port 4789 (the rest 2377 and 7946 are ok). Please see and let me know if that is indeed the issue, and let me know what are the alternatives to get past this constraint we may have in our network. – Ashish Chandra Mar 03 '20 at 16:59
  • @AshishChandra if updating the network policies is not an option, you can change docker's port during the swarm init (`--data-path-port`): https://docs.docker.com/engine/reference/commandline/swarm_init/ – BMitch Mar 03 '20 at 17:04
  • Thanks @BMitch. Appreciate it. I will give that a try. – Ashish Chandra Mar 03 '20 at 17:07
  • @AshishChandra Did you ever find a solution? – BKaun Feb 18 '21 at 06:13
  • Hi @BKaun no. But it was most likely related to blocked ports on our corporate firewall and it was just a pain to go through the process of getting them open. We were also asked to move to the corporate standard of RedHat's container solution, and away from docker swarm. – Ashish Chandra Feb 21 '21 at 17:14
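For reference, a minimal sketch of the --data-path-port workaround mentioned in the comments above. The flag can only be set when the swarm is first created, so the swarm must be re-initialized, and I believe it requires Docker 19.03 or newer; 7789 is an arbitrary example port:

# docker swarm init --advertise-addr=169.192.215.74 --data-path-port=7789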

If you want to use overlay networks to communicate across nodes, deploy your workloads as a service (docker service create) instead of as standalone containers (docker run).
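A minimal sketch, using the image ID from the question; the service name is arbitrary and sleep is just a placeholder command to keep the tasks running:

# docker service create --name rhel-test --network syndichain --replicas 2 957c8834c8a9 sleep infinity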
