Auto-scaling Instances in multi-AZ environment

Question

I am planning to build an auto-scaling group (ASG) in a multi-AZ (AZ = availability zone) network. Let's say that we ran some diagnostics and discovered that we need at least 8 instances for normal load, and 24 instances during peak times.

Here's a sample screenshot of the console.

I am confused whether these 8 instances (or 24 instances) will be run across AZs or in one AZ. Moreover, if I have to force ASG to have, say, 8 instances each in an AZ, how do I do that?

score 7 · Accepted Answer · answered Oct 13 '20 at 11:01

When you create the Auto Scaling group, you nominate the AZs in which instances should be launched.
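As a rough illustration of nominating AZs at creation time, the sketch below builds the keyword arguments that boto3's `create_auto_scaling_group` call accepts. The group name, launch template name, and AZ names are hypothetical placeholders, and the actual API call is left as a comment because it needs AWS credentials.

```python
def build_asg_request(name, azs, minimum, maximum, desired):
    """Build keyword arguments for boto3's create_auto_scaling_group.

    The launch template name below is a hypothetical placeholder.
    """
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": "web-template",  # hypothetical name
            "Version": "$Latest",
        },
        "AvailabilityZones": list(azs),  # the AZs nominated for the group
        "MinSize": minimum,
        "MaxSize": maximum,
        "DesiredCapacity": desired,
    }


# Hypothetical values based on the numbers in the question.
params = build_asg_request(
    "web-asg", ["us-east-1a", "us-east-1b", "us-east-1c"], 8, 24, 8
)
print(params["AvailabilityZones"])

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("autoscaling").create_auto_scaling_group(**params)
```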

Auto Scaling will aim to keep the number of instances in each AZ balanced. For example, when launching a new instance, it will launch in the AZ with the fewest number of instances in the Auto Scaling group (or a random AZ if they are equal). When terminating an instance, it will select an instance in the AZ with the most instances in the Auto Scaling group (or a random AZ if they are equal).
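The balancing rule described above can be sketched as a small simulation. This is only a toy model of the described behaviour, not the service's actual algorithm; the AZ names are made up, and ties are broken alphabetically here (rather than randomly) so the output is repeatable.

```python
def launch_az(counts):
    """Pick the AZ with the fewest instances (ties: alphabetical, for repeatability)."""
    return min(sorted(counts), key=lambda az: counts[az])


def terminate_az(counts):
    """Pick the AZ with the most instances (ties: alphabetical, for repeatability)."""
    return max(sorted(counts), key=lambda az: counts[az])


def scale_to(counts, desired):
    """Launch or terminate one instance at a time until the total equals desired."""
    counts = dict(counts)  # work on a copy
    while sum(counts.values()) < desired:
        counts[launch_az(counts)] += 1
    while sum(counts.values()) > desired:
        counts[terminate_az(counts)] -= 1
    return counts


empty = {"az-a": 0, "az-b": 0, "az-c": 0}  # hypothetical AZ names
normal = scale_to(empty, 8)    # normal load
peak = scale_to(normal, 24)    # scale out for peak load
print(normal)
print(peak)
```

In this model the per-AZ counts never differ by more than one after scaling.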

Therefore, to ensure 8 instances in each AZ, the Auto Scaling group would need to have an instance count equal to 8 times the number of configured AZs.

If you wish to ensure that 8 instances will be running at all times, and the Auto Scaling group is using 3 AZs, then there is the (small) possibility that one AZ might fail. If this happens, Auto Scaling will launch more instances in the remaining AZs. If your application cannot wait for these extra instances to launch, then it will need to have 4 instances in each of the 3 AZs. This way, if one AZ fails, there will still be two AZs each with 4 instances, giving 8 instances running.

Therefore:

Determine whether your system can handle the delay involved in launching replacement instances
If it can, then simply launch the minimum number of instances
If it cannot handle the delay, then launch enough instances such that there are sufficient instances even if one AZ fails
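The decision above can be sketched as a small calculation. It assumes instances are spread evenly across AZs and that only one AZ fails at a time; the function names are illustrative, and the per-AZ count is rounded up when the division is not exact.

```python
def min_size_surviving_one_az(min_required, num_azs):
    """Instances needed so that min_required remain after losing one AZ."""
    if num_azs < 2:
        raise ValueError("need at least 2 AZs to survive an AZ failure")
    surviving_azs = num_azs - 1
    # Ceiling division: each surviving AZ must carry its share of the load.
    per_az = -(-min_required // surviving_azs)
    return per_az * num_azs


def choose_minimum(min_required, num_azs, can_tolerate_launch_delay):
    """Apply the decision: plain minimum if a delay is fine, else over-provision."""
    if can_tolerate_launch_delay:
        return min_required
    return min_size_surviving_one_az(min_required, num_azs)


print(choose_minimum(8, 3, can_tolerate_launch_delay=True))   # 8
print(choose_minimum(8, 3, can_tolerate_launch_delay=False))  # 12 (4 per AZ)
print(choose_minimum(8, 2, can_tolerate_launch_delay=False))  # 16 (8 per AZ)
```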

answered Oct 13 '20 at 11:01

John Rotenstein


Thanks John. I'm learning so much from your detailed responses that I wanted to sincerely thank you for helping newbies like me. I can't put into words how good I feel after learning from your posts. If I've understood correctly, I think the gist is that if we know the minimum instance count, and if we cannot absorb any delay, then we should set `minimum instance count = min count * # of AZs` so that they are balanced automatically. – awsuser2021 Oct 13 '20 at 18:02

Not quite. It should launch enough instances so that the failure of one AZ will still provide enough instances. So the formula would be something like `Number to launch = (min_required / (Number_of_AZs - 1) ) * Number_of_AZs` – John Rotenstein Oct 13 '20 at 22:22
Thanks John. I have one more question, if you don't mind. In the above example, we determined (from a load test) that we need min. 8 (regular load) and max. 24 (peak load) instances. Let's assume there are 2 AZs. In this case, the parameters (I believe) will be min = 8, max = 24, desired = 16 (as per the formula above). Am I right? If yes, I'm concerned that during peak load, if one AZ crashes, I will have 12 instances on average. To fix this, I can multiply min/max/desired by 2 (i.e. min: 16 etc.). However, this isn't cost-effective. I read https://stackoverflow.com/a/39406170/14369982. Can you please guide me? – awsuser2021 Oct 13 '20 at 23:20

If you do not want to wait for additional instances to start, then you would require `Minimum = 16` since that would give 8 instances in each of the 2 AZs in case one AZ fails. It would be lower cost to run across 3 AZs, since it would require `Minimum = 12` (4 in each AZ). Frankly, the likelihood of an AZ failing is quite low, so it is a trade-off between your risk appetite and the cost of running extra instances. You say that you need a minimum of 8, but you might be willing to have a lower number for a few minutes in the rare event of an AZ failure while Auto Scaling launches more instances. – John Rotenstein Oct 14 '20 at 02:34
Thanks John. I got it. I have one last question. In your example, we calculated `minimum = 16`. To build on this, I believe with 2 AZs, `max will be 24*2 = 48` because `peak load = 24`. Is that right? If so, what will the **desired** capacity be? Could you please guide me? I am looking for guiding principles for calculating the desired number of instances. – awsuser2021 Oct 14 '20 at 05:55

Desired Capacity = "How many you want right now". Auto Scaling will attempt to give you that many instances by launching/terminating instances until it reaches that number. The _actual_ number of instances will never go below Minimum or above Maximum, even if Desired Capacity requests a number outside those bounds. | You can set Maximum to whatever you want -- it simply sets a limit that Auto Scaling will not exceed. – John Rotenstein Oct 14 '20 at 07:30
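The relationship between the three settings, as described in the comment above, can be modelled as a simple clamp. This is only a sketch of that description (the function name is made up), not of how the service validates or processes requests.

```python
def effective_capacity(desired, minimum, maximum):
    """Clamp the desired capacity into the [minimum, maximum] range."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(desired, maximum))


# Using the numbers from the thread: Minimum = 16, Maximum = 48.
print(effective_capacity(20, 16, 48))  # 20: within bounds, used as-is
print(effective_capacity(4, 16, 48))   # 16: raised to the minimum
print(effective_capacity(60, 16, 48))  # 48: capped at the maximum
```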

score 2 · Answer 2 · edited Feb 09 '23 at 05:24

Auto Scaling keeps the number of instances balanced evenly across multiple AZs.

For example:

You set 8 instances as minimum.

If you set 2 AZs for Auto Scaling, each AZ has 4 instances (4 + 4 = 8).

If you set 3 AZs for Auto Scaling, 2 AZs will have 3 instances each and 1 AZ has 2 instances (3 + 3 + 2 = 8).

In total, at least 8 instances are kept across multiple AZs for normal load.

You set 24 instances as maximum.

If you set 2 AZs for Auto Scaling, each AZ has 12 instances (12 + 12 = 24).

If you set 3 AZs for Auto Scaling, each AZ has 8 instances (8 + 8 + 8 = 24).

In total, at most 24 instances are kept across multiple AZs during peak times.
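The splits in the examples above are just an even division with the remainder handed out one instance at a time. A minimal sketch of that arithmetic (which AZ ends up with the extra instances is not modelled here, and the function name is illustrative):

```python
def spread(total, num_azs):
    """Split total instances across num_azs as evenly as possible."""
    base, extra = divmod(total, num_azs)
    # The first `extra` AZs get one more instance than the rest.
    return [base + 1] * extra + [base] * (num_azs - extra)


print(spread(8, 2))   # [4, 4]
print(spread(8, 3))   # [3, 3, 2]
print(spread(24, 2))  # [12, 12]
print(spread(24, 3))  # [8, 8, 8]
```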

AWS answers this question as well: "Q: How does Amazon EC2 Auto Scaling balance capacity?"
