I'm playing around with Docker Swarm.
I have a three-node cluster: 1 manager and 2 worker nodes. I'm using VIP (virtual IP) mode
for all my services.
I ran into a weird situation after I restarted a worker node.
I ran docker node ls and the worker node was Ready.
docker service ls showed that the replicas of the containers on that worker were fine.
The problem: I couldn't reach the node through the ingress network. No container on the other nodes was able to access a container on that worker node.
I checked the containers; they were all attached to the ingress network.
I curled the containers from within the same node and they responded.
I pinged the service name from a container (on the same malfunctioning node) and it worked.
Curling the worker's containers from the manager did not work!
I curled using the IP address of the worker and they responded.
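In case it helps anyone reproduce this, the checks above correspond roughly to the commands below. This is only a sketch: the service name, worker IP, and port are placeholders rather than the real values, and the run helper with its DRY_RUN switch is purely illustrative, so by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the checks described above. Every value here is a placeholder,
# not taken from the actual cluster.
SERVICE="${SERVICE:-my_service}"      # hypothetical service name
WORKER_IP="${WORKER_IP:-10.0.0.12}"   # hypothetical address of the restarted worker
PORT="${PORT:-8080}"                  # hypothetical published port
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the commands, 0 = execute them

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

run docker node ls                     # is the worker listed as Ready?
run docker service ls                  # are the replica counts as expected?
run docker service ps "$SERVICE"       # which node does each task run on?
run docker network inspect ingress     # what is attached to the ingress network on this node?
run curl -sS -m 5 "http://$WORKER_IP:$PORT/"   # reach the service via the worker's IP
```

Running it with DRY_RUN=0 on the manager and again on the affected worker would show which of the checks differ between the two nodes.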
I restarted the worker node, but the issue persisted. Then I restarted the whole cluster and it worked again!
Is there any explanation for what I just witnessed?
I'm most worried that this could happen in a production environment.
Thank you in advance.