I am running nginx as part of the docker-compose template.
In nginx config I am referring to other services by their docker hostnames (e.g. backend
, ui
).
That works fine until I do that trick:
docker stop backend
docker stop ui
docker start ui
docker start backend
which makes backend and ui containers to exchange IP addresses (docker provides private network IPs on a basis of giving the next IP available in CIDR to each new requester). This 4 commands executed imitate some rare cases when both upstream containers got restarted at the same time but the nginx container did not. Also, I believe, this should be a very common situation when running pods on Kubernetes-based clusters.
Now nginx resolves backend
host to ui's IP and ui
to backend's IP.
Reloading nginx' configuration does help (nginx -s reload
).
Also, if I do nslookup from within the nginx container - the IPs are always resolved correctly.
So this isolates the problem to be a pure nginx issue around the DNS caching.
The things I tried:
- I have the resolver set under the http {} block in nginx config:
resolver 127.0.0.11 ipv6=off valid=10s;
- Most common solution proposed by the folks on the internet to use variables in proxy-pass (this helps to prevent nginx to resolve and cache DNS records on start) - that did not make ANY difference at all:
server {
<...>
set $mybackend "backend:3000";
location /backend/ {
proxy_pass http://$mybackend;
}
}
- Tried adding resolver line into the location itself
- Tried setting the variable on the http{} block level, using
map
:
http {
map "" $mybackend {
default backend:3000;
}
server {
...
}
}
- Tried to use openresty fork of nginx (https://hub.docker.com/r/openresty/openresty/) with
resolver local=true
None of the solutions gave any effect at all. The DNS caches are only wiped if I reload nginx configuration inside of the container OR restart the container manually.
My current workaround is to use static docker network declared in docker-compose.yml. But this has its cons too.
Nginx version used: 1.20.0 (latest as of now) Openresty versions used: 1.13.6.1 and 1.19.3.1 (latest as of now)
Would appreciate any thoughts
UPDATE 2021-09-08: Few months later I am back to solving this same issue and still no luck. Really looks like the bug in nginx - I can not make nginx to re-resolve the dns names. There seems to be no timeout to nginx' dns cache and none of the options listed above to introduce timeouts or trigger dns flush work.
UPDATE 2022-01-11: I think the problem is really in the nginx. I tested my config in many ways a couple months ago and it looks like something else in my nginx.conf prevents the valid
parameter of the resolver
directive from working properly. It is either the limit_req_zone
or the proxy_cache_path
directives used for request rate limiting and caching respectively. These just don't play nicely with the valid
param for some reason. And I could not find any information about this anywhere in nginx docs.
I will get back to this later to confirm my hypothesis.