My Azure App Service Plan is behaving strangely when scaling in and out. The unconfigurable load balancer that I get with my plan seems to be redirecting traffic in an incorrect way. My plan hosts multiple C# .NET API's, with only one of them serving traffic in below screenshot. ARR affinity is turned off for all apps in the plan.
Sometimes when scaling down from 3 to 2 instances, all traffic suddenly goes to only one instance instead of the 2 that are still up and ready to serve request (causing CPU overload one no requests being served). The same can happen when scaling out from 2 to 3 instances, so only one of the currently 2 processing instances continue to process until the new third instance is up and running (causing CPU overload again). Please see picture for better exaplanation.
Looking at thread dumps it is waiting for System.Threading.Monitor.Wait with the call stack only being System.Threading stuff. So it isn't giving me much either.