I am going to deploy a Python Flask server with Docker on Kubernetes, using Gunicorn with Gevent/Eventlet asynchronous workers. The application will:
- Subscribe to around 20 different topics on Apache Kafka.
- Score some machine learning models with that data.
- Upload the results to a relational database.
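To make the structure concrete, here is a minimal stdlib-only sketch of the per-message pipeline. All names are placeholders: the real message source is a Kafka consumer (subscribed to the ~20 topics) and the real sink is the relational database; both are stubbed here so the sketch is self-contained.

```python
def score_model(payload):
    # Placeholder for the ML scoring step (takes ~45 s in the real service).
    return {"topic": payload["topic"], "score": len(payload["value"])}

def upload_result(result, sink):
    # Placeholder for the relational-database insert.
    sink.append(result)

def run_pipeline(messages, sink):
    # In the real service, `messages` is the Kafka consumer iterator.
    for msg in messages:
        upload_result(score_model(msg), sink)

results = []
run_pipeline([{"topic": "sensor-1", "value": b"abc"}], results)
```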
Each topic in Kafka receives 1 message per minute, so the application needs to consume around 20 messages per minute in total, and handling a single message takes around 45 seconds.

My question is: how can I scale this properly? I know I can run multiple Gunicorn workers and deploy multiple replicas of the pod on Kubernetes. But is that enough? Will the workload be balanced automatically across the available workers in the different pods? Or what else can I do to ensure scalability?
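For reference, here is the back-of-the-envelope arithmetic behind my concern: the incoming work amounts to roughly 900 seconds of processing per minute of wall-clock time, so the deployment needs at least ~15 messages being handled concurrently just to keep up.

```python
import math

TOPICS = 20
MSGS_PER_TOPIC_PER_MIN = 1
HANDLE_SECONDS = 45

# Seconds of processing work arriving per wall-clock minute:
work_per_min = TOPICS * MSGS_PER_TOPIC_PER_MIN * HANDLE_SECONDS  # 900

# Minimum number of messages that must be in flight concurrently
# for throughput to match the arrival rate:
min_concurrency = math.ceil(work_per_min / 60)  # 15

print(min_concurrency)
```

So however the workers and replicas are arranged, their combined concurrency has to stay above that floor (plus headroom for spikes and restarts).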