I have the following AWS setup:
- ECS cluster with 2 EC2 instances, each running in its own subnet, with the two subnets in different AZs
- Several microservices, each running only one task because they are stateful
- One internal Application Load Balancer with a target group for each microservice, mapped by port
Now imagine the following scenario:
Service 1 wants to communicate with Service 2, which is running on the other EC2 instance in the other AZ. As the URL of Service 2 I use the DNS name of the Load Balancer plus the port: internal-load-balancer:8082/path. This is necessary because I'm using rolling deployments, so the microservices move between the two EC2 instances after each deployment.
Now if I execute
host internal-load-balancer
I get back 2 IP addresses, one for the Load Balancer node in Subnet 1 and one for the node in Subnet 2:
- 10.0.0.11
- 10.0.32.11
If I now execute the following curl commands on Service 1:
curl 10.0.0.11:8082/
I get back a Gateway Timeout.
curl 10.0.32.11:8082/
works as expected and I get back a 200.
curl 10.0.32.10:8082/
also works.
So why in the hell does this work when I go through the Load Balancer node in the same subnet as Service 2, but not through the node in the other subnet? It also works if I contact the EC2 instance in the other AZ directly. The problem is that the DNS record resolves to both IP addresses and the microservice just picks one of them at random, so half of my requests work and the other half time out.
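For anyone who wants to reproduce this, here is a minimal sketch of how each resolved address can be checked in one go instead of curling them by hand. It assumes bash, curl, host and awk are available where Service 1 runs. `probe_nodes` is just a hypothetical helper name, and the 5-second timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# probe_nodes PORT IP...  (hypothetical helper, not part of any AWS tooling)
# Sends one request to each IP and prints "<ip> <http status>", or
# "<ip> failed" when curl itself exits non-zero (for example on a
# connection timeout).
probe_nodes() {
  local port="$1"
  shift
  local ip code
  for ip in "$@"; do
    # -s: quiet, -o /dev/null: drop the body, -w: print only the status code,
    # --max-time 5: give up after 5 seconds (arbitrary value).
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "http://${ip}:${port}/") || code="failed"
    echo "${ip} ${code}"
  done
}

# Example call, feeding in the addresses that `host` reports:
# probe_nodes 8082 $(host internal-load-balancer | awk '/has address/ {print $4}')
```

If the Load Balancer itself answers with a Gateway Timeout, that should show up as a 504 next to the affected IP, which makes it easy to see which node is the one misbehaving.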
So what am I doing wrong here? Thanks in advance for your support :)