I've developed a Docker-based application composed of multiple microservices. It has to consume Amazon SQS messages and process them. At first I wanted to use AWS Elastic Beanstalk, but then I came across the EC2 Container Service (ECS). Now I don't know which one to choose.
Elastic Beanstalk now supports multi-container Docker environments. That's great, because every microservice has its own application server inside a Docker container. The next problem is scaling:
I don't know how the scaling mechanism works. For example: I have 5 Docker containers in my Elastic Beanstalk environment. Only the fifth container is under heavy load, because it has a huge number of SQS messages to process. The other four are nearly idle, because they don't need much CPU or don't have many SQS messages. Let's assume the 5th container runs a JBoss application server. As far as I know, the server can only handle a limited number of parallel requests, even if there is enough CPU/memory available.
If the JBoss Docker container isn't able to handle the number of requests, but there is enough CPU/memory available, I want to automatically start a second Docker/JBoss container on the same instance. But what happens if I don't have enough CPU/memory? Then I want to spin up a second instance, which is configurable through an Auto Scaling group in EB. Now a second instance spins up, but every container except the 5th is nearly idle, so I don't want the other 4 to be spawned unnecessarily on the second instance too, which would be a waste of resources. Only the 5th should spawn there, and the others should scale the same way the 5th does, based on configurable parameters such as CPU, memory or SQS queue length.
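To make the behaviour I'm after more concrete, here is a rough Python sketch of the decision logic I'd like the platform to perform for a single service. It does not use any Elastic Beanstalk or ECS API; the names (`Instance`, `desired_containers`, `place`), the CPU/memory units and the per-container message capacity are all made up for illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Instance:
    name: str
    free_cpu: int  # free CPU units on this instance (hypothetical unit)
    free_mem: int  # free memory in MiB


def desired_containers(queue_depth: int, msgs_per_container: int) -> int:
    """Containers needed so each handles at most msgs_per_container messages."""
    return max(1, ceil(queue_depth / msgs_per_container))


def place(instances, need_cpu, need_mem, count):
    """Try to place `count` new containers of ONE service.

    Existing instances are filled first (their free resources are reduced
    in place). Returns the list of chosen instance names and the number of
    containers that did not fit, i.e. the ones that would require a new
    instance from the auto-scaling group.
    """
    placements = []
    unplaced = 0
    for _ in range(count):
        host = next(
            (i for i in instances
             if i.free_cpu >= need_cpu and i.free_mem >= need_mem),
            None,
        )
        if host is None:
            unplaced += 1  # no room anywhere: a new instance is needed
        else:
            host.free_cpu -= need_cpu
            host.free_mem -= need_mem
            placements.append(host.name)
    return placements, unplaced


if __name__ == "__main__":
    cluster = [Instance("instance-1", free_cpu=512, free_mem=1024)]
    running = 1  # one JBoss container is already running
    wanted = desired_containers(queue_depth=2500, msgs_per_container=1000)
    print(place(cluster, need_cpu=256, need_mem=768, count=wanted - running))
```

The important part is that the calculation is done per service: only the overloaded container is multiplied, and a new instance is requested only when a container does not fit on any existing one.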
I don't know exactly whether Amazon ECS does this, or whether it's possible at all. I really can't find any source on the internet about this topic, which is, generally speaking, scaling at both the instance level and the container level.