Can we limit the number of requests going to an EC2 instance via a load balancer (ELB or ALB)?
We have a requirement to load balance WebSocket traffic to EC2. An ALB load balances my EC2 instances, which are configured with an Auto Scaling group and a CloudWatch alarm. When CPU utilization reaches 70%, Auto Scaling spins up a new instance and the ALB starts routing traffic to it. However, the ALB also keeps sending traffic to the old instance, which has already reached 70%, because it distributes requests in round-robin fashion. This triggers the alarm again and yet another EC2 instance is launched, which is not right. I want to overcome this behaviour. If I can make the ALB stop sending traffic to EC2 instances that are already at 70% load, and resume only after their load drops, my issue is resolved.
Can we limit the traffic the ALB sends to an EC2 instance by numbers? For example, the ALB should route a request to an EC2 instance only when 1) its connection count is less than 500, OR 2) both conditions hold (connection count < 500 and CPU utilization < 70%).
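
To make the rule I have in mind concrete, here is a minimal Python sketch of it. This is not an ALB API or setting; it only models the decision I would like the load balancer to make. The `Instance` class, the function names, and the instance IDs are all hypothetical, and I am assuming the second condition means CPU below 70%, in line with the goal described above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_CONNECTIONS = 500   # connection cap from the question
CPU_THRESHOLD = 70.0    # percent, same value as the scaling alarm


@dataclass
class Instance:
    """Hypothetical snapshot of one EC2 target's load."""
    instance_id: str
    active_connections: int
    cpu_percent: float


def eligible_by_connections(inst: Instance) -> bool:
    """Option 1: route only while the connection count is under the cap."""
    return inst.active_connections < MAX_CONNECTIONS


def eligible_by_both(inst: Instance) -> bool:
    """Option 2: require both the connection cap and CPU below the threshold."""
    return (inst.active_connections < MAX_CONNECTIONS
            and inst.cpu_percent < CPU_THRESHOLD)


def pick_target(instances: List[Instance],
                rule: Callable[[Instance], bool]) -> Optional[Instance]:
    """Return the eligible instance with the fewest connections, or None."""
    candidates = [inst for inst in instances if rule(inst)]
    if not candidates:
        return None
    return min(candidates, key=lambda inst: inst.active_connections)


if __name__ == "__main__":
    fleet = [
        Instance("i-old", active_connections=480, cpu_percent=72.0),
        Instance("i-new", active_connections=10, cpu_percent=5.0),
    ]
    target = pick_target(fleet, eligible_by_both)
    print(target.instance_id if target else "no eligible instance")
```

With option 2, the old instance at 72% CPU is skipped and only the new instance receives traffic until the old one drops back under the threshold.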