
I have the following AWS setup:

  • ECS cluster with 2 EC2 instances, each running in its own subnet, with the two subnets in different AZs
  • Several microservices, each running only a single task because they are stateful
  • One internal Application Load Balancer with a target group for each microservice, mapped by port

AWS Setup

Now imagine the following scenario: Service 1 wants to communicate with Service 2, which is running on the other EC2 instance in the other AZ. As the URL of Service 2 I use the DNS name of the load balancer plus the port: internal-load-balancer:8082/path. This is necessary because I'm using rolling deployments, so the microservices move between the two EC2 instances after each deployment.

If I now run host internal-load-balancer, I get back 2 IP addresses, one for the load balancer node in Subnet 1 and one for the node in Subnet 2:

  • 10.0.0.11
  • 10.0.32.11

If I now run the following curl commands from Service 1:

  • curl 10.0.0.11:8082/ returns a Gateway Timeout
  • curl 10.0.32.11:8082/ works as expected and returns a 200
  • curl 10.0.32.10:8082/ also works
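The manual host and curl checks above can be repeated in one step. The following is only a minimal sketch, not part of the original setup: the function names (resolve_ipv4, http_status, probe_all) are made up for illustration, and it assumes the services speak plain HTTP on the given port. Note that a Gateway Timeout is an HTTP 504 response, which means the TCP connection to the load balancer node itself succeeded, so the sketch reports the HTTP status per IP rather than just testing whether the port is open.

```python
import http.client
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def http_status(ip, port, path="/", timeout=5.0):
    """Send GET path to ip:port and return the HTTP status code.

    Returns None if the connection or the request fails (refused, timed out,
    or an unparsable response).
    """
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def probe_all(hostname, port, path="/", timeout=5.0):
    """Map every IPv4 address behind hostname to the status it returns."""
    return {ip: http_status(ip, port, path, timeout)
            for ip in resolve_ipv4(hostname)}


# Hypothetical usage from inside the Service 1 container:
#   for ip, status in probe_all("internal-load-balancer", 8082).items():
#       print(ip, status)
```

Running this from each of the two EC2 instances in turn should show whether the failing load balancer IP depends on where the request originates.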

So why does this work when I use the load balancer node in the same subnet, but not the one in the other subnet? It also works if I contact the EC2 instance in the other AZ directly. The problem is that the DNS record resolves to both IP addresses and the microservice just picks one of them at random, so half of my requests work and the other half time out.

So what am I doing wrong here? Thanks in advance for your support :)

tzwickl
  • FWIW I set up a basic parallel example in AWS and mine is working, so I don't think there's any theoretical problem. The only thing I can think of is: could the security group on your EC2 instances be set up to allow access from the other EC2 instances based on their security group membership as opposed to an IP address? If 10.0.0.10 is allowed to access 10.0.32.10 due to security group membership, and 10.0.32.0/19 is allowed but 10.0.0.0/19 isn't, that could theoretically do this (I think). – jefftrotman Aug 10 '19 at 14:24
  • Thanks for trying this out :) I just came across this question on StackOverflow (https://stackoverflow.com/questions/9257514/amazon-elb-in-vpc) and it seems to have solved my problem :D All I did was move the LB from the private subnet to the public one, and now it works. I have no idea what the difference is or why this is necessary. Maybe it's a bug on the AWS side, or maybe it is supposed to work this way. – tzwickl Aug 10 '19 at 16:13
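To make the CIDR reasoning in the first comment concrete, here is a small sketch that maps the addresses from the question onto the two /19 ranges the commenter mentions. The ranges are the commenter's guess at the subnet layout, not something confirmed in the question, and subnet_of is a name made up for this illustration.

```python
import ipaddress

# Subnet ranges as assumed in the comment above (not confirmed by the asker).
SUBNETS = [
    ipaddress.ip_network("10.0.0.0/19"),
    ipaddress.ip_network("10.0.32.0/19"),
]


def subnet_of(ip):
    """Return the assumed subnet (as a string) containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for net in SUBNETS:
        if addr in net:
            return str(net)
    return None


# Addresses taken from the question.
for ip in ("10.0.0.11", "10.0.32.11", "10.0.32.10"):
    print(ip, "->", subnet_of(ip))
```

Under that assumption, a security group rule that allows 10.0.32.0/19 but not 10.0.0.0/19 would admit traffic arriving from 10.0.32.11 and reject traffic arriving from 10.0.0.11, which matches the pattern the commenter describes.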

1 Answer


It seems I found a solution to this problem in this StackOverflow question: https://stackoverflow.com/questions/9257514/amazon-elb-in-vpc. What I did was move the Application Load Balancer to the public subnet, which has an internet gateway attached, and now both load balancer IPs work without any problems. I have no idea why it only works this way, but I'm glad I found a solution :)

Can anyone explain to me why the ALB needs to be in a public subnet?

tzwickl