
The objective is to get a mixed-OS Docker swarm running, using Linux servers and Windows 10 machines running Docker for Windows.

Currently, Windows workers are theoretically supported in mixed-OS swarms, provided the --endpoint-mode flag is set to 'dnsrr'. This is explained here. However, attempts to use traefik to route to the simple stefanscherer/whoami image have failed.

Minimal Failing Example

# On (Linux) Manager Node:
docker swarm init --advertise-addr <hostaddress> --listen-addr <hostaddress>:2377

# On (Windows 10) Worker Node:
docker swarm join --token <jointoken> <hostaddress>:2377

# On Manager Node:
docker network create --driver=overlay traefik-net

docker service create \
    --name traefik \
    --constraint=node.role==manager \
    --publish 80:80 --publish 8080:8080 \
    --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
    --network traefik-net \
    traefik \
    --docker \
    --docker.swarmmode \
    --docker.domain=traefik \
    --docker.watch \
    --web

docker service create \
    --name whoami \
    --label traefik.enable=true \
    --label traefik.frontend.rule=Host:whoami.docker \
    --label traefik.protocol=http \
    --label traefik.docker.network=traefik-net \
    --label traefik.backend.loadbalancer.method=drr \
    --label traefik.backend=whoami \
    --network traefik-net \
    --mode global \
    --label traefik.port=80 \
    stefanscherer/whoami
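Before debugging the routing, it may help to confirm that the swarm really contains both platforms. The following is a minimal sketch, assuming it is run on the manager node; the function name `list_node_platforms` is made up for illustration, while `docker node ls` and `docker node inspect` are the standard CLI commands.

```shell
# Print one line per swarm node: hostname, OS and role.
# list_node_platforms is a hypothetical helper name, not part of Docker.
list_node_platforms() {
    for node_id in $(docker node ls -q); do
        docker node inspect "$node_id" \
            --format '{{ .Description.Hostname }} {{ .Description.Platform.OS }} {{ .Spec.Role }}'
    done
}

# Usage (on the manager): list_node_platforms
```

In the setup described here, one would expect a line containing `linux manager` and a line containing `windows worker`.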

Traefik successfully sets up backend rules. To check the routing, I used the traefik dashboard to find the URL that the rule routes to, e.g. '10.0.0.12:8080'. I then compared this with the IP address of each task. The tasks can be listed with docker service ps, and their addresses found using

docker inspect <taskID> \
    --format '{{ range .NetworksAttachments }}{{ .Addresses }}{{ end }}'
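The two steps above (listing the tasks, then inspecting each one) can be combined. This is only a sketch under the assumption that the service is named as in the example; `task_addresses` is a hypothetical helper name, and the `desired-state` filter is used to skip tasks that have already been shut down.

```shell
# For every running task of a service, print: task ID, node ID, overlay addresses.
# task_addresses is a hypothetical helper name, not part of Docker.
task_addresses() {
    service="$1"
    for task_id in $(docker service ps -q --filter desired-state=running "$service"); do
        docker inspect "$task_id" \
            --format '{{ .ID }} {{ .NodeID }} {{ range .NetworksAttachments }}{{ .Addresses }}{{ end }}'
    done
}

# Usage (on the manager): task_addresses whoami
```

The node ID in each line makes it easier to tell which address belongs to the task on the Windows worker when comparing against the backend URLs shown in the traefik dashboard.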

The Problem

An HTTP request with the header 'Host:whoami.docker' sent to the IP of the manager succeeds when traefik routes it to the task on the manager, and fails with 504 Gateway Timeout when traefik routes it to the Windows task on the Windows worker.
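A request like the one described can be repeated a few times to watch the status code change as traefik alternates between backends. This is a minimal sketch, assuming traefik is published on port 80 of the manager as in the example; `probe_route` is a hypothetical helper name, and the 60-second limit is an arbitrary choice meant to be long enough for traefik to answer with its own gateway error.

```shell
# Send N requests with the Host header traefik matches on, printing one HTTP status per line.
# probe_route is a hypothetical helper name; $1 is the manager's IP, $2 the number of attempts.
probe_route() {
    manager_ip="$1"
    attempts="${2:-4}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        curl -s -o /dev/null --max-time 60 \
            -w '%{http_code}\n' \
            -H 'Host: whoami.docker' \
            "http://$manager_ip/"
        i=$((i + 1))
    done
}

# Usage: probe_route <hostaddress> 4
```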

Matt Hawes
  • can you list the exact commands that you have used to check the traefik routing – varnit Aug 22 '17 at 16:17
  • @varnit I've edited the question to explain. – Matt Hawes Aug 24 '17 at 22:35
  • One more thing Matt which IP you are using for testing purposes ? – varnit Aug 25 '17 at 00:32
  • Are you using docker machine ip for windows host or Windows host IP ? – varnit Aug 25 '17 at 00:32
  • @varnit Thanks for helping out, could you clarify what you mean by 'which IP are you using'. 'Using' in what sense? – Matt Hawes Aug 25 '17 at 10:45
  • I mean which IP you are using for testing while using curl – varnit Aug 25 '17 at 12:58
  • @varnit ah, that would be the IP address of the manager node, `` in my example above. – Matt Hawes Aug 25 '17 at 13:11
  • The problem is that Windows and Mac do not run Docker natively; they use a small Linux VM underneath. So for the Windows machine you have to use the docker-machine IP instead of the Windows host IP. Give it a try and let me know if that works – varnit Aug 25 '17 at 13:22
  • You can find docker machine ip using this command docker-machine ip – varnit Aug 25 '17 at 13:26
  • Did my suggestion work ? – varnit Aug 26 '17 at 03:46
  • @varnit Actually, as I am using Windows containers on the Windows worker, it doesn't use the Linux VM. Additionally, it shouldn't matter which node you use. The idea is to use traefik to route the requests. – Matt Hawes Aug 29 '17 at 14:12
  • @MattHawes thanks for confirming I'm not crazy. I attached a bounty to this question which will hopefully draw out some answers. – Thorn G Aug 29 '17 at 22:46
  • @TomG you get 504, can you please confirm what happens when you do the curl from the container? 1) Exec into the container on the manager node (Linux) 2) Curl from the container to the endpoint you are trying to hit. I see the problem here differently: I believe you are running everything on the overlay network, per the setup posted above, and the port is not exposed at host level. Can you try and confirm the above please? Thanks! – TheeCodeDragon Nov 10 '17 at 14:45
  • @TheeCodeDragon Your comment is not entirely clear. What container do you suggest running curl from -- the one Traefik is running in? `docker exec curl 10.0.0.13:8080` gives me an error message stating that curl is not found in $PATH. – Thorn G Dec 11 '17 at 22:07
  • docker run -it -d containerid bin/bash curl http://10.0.0.13:8080, replace the containerid with the traefik container and share the output please. Thanks – TheeCodeDragon Dec 12 '17 at 16:26
  • Btw, the above is only in case you are going to a image like alpine to do a quick test, otherwise the exec should suffice. The other thing I asked earlier was to exec "into" the container and try the curl. – TheeCodeDragon Dec 12 '17 at 18:31
  • @lifeisfoo, I'm a bit confused about your edits, I was getting 502 bad gateway but you've just changed my question to 504 gateway timeout? – Matt Hawes Feb 14 '18 at 16:39
  • @MattHawes I can't reproduce it right now, so please edit your question as you want. Thank you – lifeisfoo Feb 15 '18 at 10:06
  • @Matt Hawes did you come up with a solution for this problem? – Benedikt Schmeitz Jul 31 '18 at 11:31
  • @BenediktSchmeitz Unfortunately not! I've not tried again recently however – Matt Hawes Aug 01 '18 at 20:59

2 Answers

3

You're not setting --endpoint-mode=dnsrr on your whoami service.

docker service create \
    --name whoami \
    --label traefik.enable=true \
    --label traefik.frontend.rule=Host:whoami.docker \
    --label traefik.protocol=http \
    --label traefik.docker.network=traefik-net \
    --label traefik.backend.loadbalancer.method=drr \
    --label traefik.backend=whoami \
    --network traefik-net \
    --mode global \
    --label traefik.port=80 \
    --endpoint-mode=dnsrr \
    stefanscherer/whoami

Setting --endpoint-mode to dnsrr disables the service's virtual IP (VIP), which is probably what is causing the issue.
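To confirm which mode a service actually ended up with, the service spec can be inspected. This is a sketch only: `endpoint_mode` and `require_dnsrr` are hypothetical helper names, and it assumes the mode is reported under `.Spec.EndpointSpec.Mode` in the `docker service inspect` output, which is worth checking against the raw JSON on your Docker version.

```shell
# Print the endpoint mode (vip or dnsrr) recorded in a service's spec.
# Assumption: the value lives at .Spec.EndpointSpec.Mode in the inspect output.
endpoint_mode() {
    docker service inspect "$1" --format '{{ .Spec.EndpointSpec.Mode }}'
}

# Succeed only if the service uses DNS round-robin; otherwise report the mode found.
require_dnsrr() {
    mode=$(endpoint_mode "$1")
    if [ "$mode" = "dnsrr" ]; then
        echo "$1: endpoint mode is dnsrr"
    else
        echo "$1: endpoint mode is '$mode', expected dnsrr" >&2
        return 1
    fi
}

# Usage (on the manager): require_dnsrr whoami
```

If the check fails for a service created without the flag, recreating the service with the flag set, as in the command above, is the simplest route.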

Miq
0

I had the same problem when using the stefanscherer/whoami image. Using microsoft/dotnet-samples:aspnetapp works though, so the error seems related to the image.

I'm using the following setup:

Ubuntu 16.04

  • Docker 18.03.1-ce
  • Runs as Manager
  • Runs traefik

Windows 1803

  • Docker 18.03.1-ee-2
  • Runs as Worker (joining as Manager did not work)
  • Runs microsoft/dotnet-samples:aspnetapp
Dresel