I deploy an nginx proxy service and a Rails app service into a Docker swarm. In my docker-compose file, the nginx service depends on the app service.
My nginx.conf directs traffic to the upstream app service (exposed on port 3000) like so (only the upstream part is shown):
upstream puma {
    server app:3000;
}
My docker-compose file looks like so:
version: '3.1'
services:
  app:
    image: my/rails-app:latest
    networks:
      - proxy
  web:
    image: my/nginx:1.11.9-alpine
    command: /bin/sh -c "nginx -g 'daemon off;'"
    ports:
      - "80:80"
    depends_on:
      - app
    networks:
      - proxy
networks:
  proxy:
    external: true
My host is set up as a swarm manager.
In general, this all works fine.
However, even though I have a depends_on section in my docker-compose file, the app service may not be fully ready by the time the nginx service starts. When nginx resolves the upstream "app:3000" at startup, the lookup does not seem to return the right address. When I then visit my site, I find the following error message in my nginx logs:
2017/02/13 10:46:07 [error] 8#8: *6 connect() failed (111: Connection refused) while connecting to upstream, client: 10.255.0.3, server: www.mysite.com, request: "GET / HTTP/1.1", upstream: "http://127.0.53.53:3000/", host: "preprod.local"
If I kill the Docker container running the nginx service and swarm reschedules it a moment later, visiting the same URL then works fine, and the request is passed upstream to app:3000 successfully.
How can I prevent this from happening? The startup timings seem to be slightly off: when nginx starts, it cannot yet properly resolve my swarm service app:3000, and instead it attempts to pass the traffic on to an IP address ....
BTW, the same happens if I reboot my virtual machine: when Docker (in swarm mode) brings the services up again, I can end up with the same problem. Restarting the nginx container solves it.
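For reference, here is a minimal sketch of one approach sometimes suggested for this kind of timing problem: have nginx look up the name at request time instead of once at startup. It assumes Docker's embedded DNS server is reachable at 127.0.0.11 inside the container. The `location /` block, the `$app_upstream` variable name, and the 10-second cache time are hypothetical and not taken from my actual config, and the fragment is untested against the setup above.

```nginx
# Goes inside the http { } block of nginx.conf.
# Assumption: Docker's embedded DNS listens on 127.0.0.11 inside the container.
resolver 127.0.0.11 valid=10s;

server {
    listen 80;

    location / {
        # Using a variable in proxy_pass makes nginx resolve the name
        # at request time through the resolver above, not at startup.
        set $app_upstream app;  # hypothetical variable name
        proxy_pass http://$app_upstream:3000;
    }
}
```

Note that this bypasses the `upstream puma` block, so any settings defined there would not apply to these requests.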