
When creating an async gRPC server, we can set the number of Python threads via the first argument (`migration_thread_pool`):

import grpc
from concurrent import futures

# The executor is only used to run synchronous (non-async) handlers
server = grpc.aio.server(futures.ThreadPoolExecutor(max_workers=2))

Given that the GIL prevents multiple OS threads from running Python code in parallel, have the devs behind grpcio added some way to scale the server across more CPU cores?
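To make the GIL premise concrete, here is a quick self-contained check (not from the original post): CPU-bound work does not speed up when spread across more threads, because the GIL serialises Python bytecode execution. The `burn` and `timed` helpers are illustrative names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def burn(n: int) -> int:
    # CPU-bound: pure Python arithmetic, holds the GIL throughout
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(workers: int, tasks: int = 4, n: int = 2_000_000) -> float:
    # Run `tasks` CPU-bound jobs on `workers` threads, return wall time
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(burn, [n] * tasks))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"1 thread:  {timed(1):.2f}s")
    print(f"4 threads: {timed(4):.2f}s")  # roughly the same: the GIL serialises CPU work
```

For IO-bound handlers the picture is different, since threads release the GIL while waiting on sockets, which is why the comments below still recommend tuning thread count under load.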

A near-duplicate (and outdated) question is this: What the purpose of ThreadPoolExecutor in grpc server?. Its answer is already more than four years old, and back then adding more workers was of limited use.

I was thinking of profiling, but I have not yet set that up.

  • Threads in Python, like async, provide "concurrency" but not "parallelism". So they are useful for code which is performing IO operations, such as a client or server, since even with the GIL a thread can make progress while others are idle waiting for data. So you can gain benefit by increasing thread count until CPU utilisation under load stops increasing. – Anentropic Aug 21 '23 at 16:14
  • @Anentropic thank you for your answer. Does this mean that scaling a (virtual) machine / container / pod, that runs only a Python (async) gRPC app, beyond one CPU core is a waste of resources? – Sjoerd van den Bos Aug 22 '23 at 15:10
  • You have two dimensions: cpu cores and threads. With Python there is one process per cpu core (i.e. you have to run multiple instances of the app, multiple Python processes, to use more cpu cores). Each Python process can run multiple threads. If each process is using < 100% of a single cpu core then you can try increasing the number of threads. At some point increasing the threads per process won't improve utilisation any further and instead will start to increase the latency of your responses. Finding the right number is by observing metrics of the system when under typical load. – Anentropic Aug 22 '23 at 15:17
  • Each python process will only use a single cpu core (unless explicitly using `multiprocessing` to schedule work into sub-processes). If you have a multi-core server you need to run multiple instances of the Python app to be able to use all the cores – Anentropic Aug 22 '23 at 15:18
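The pattern described in the comments above (one process per core, multiple instances of the app) can be sketched as follows. This is an assumption-laden outline, not an official grpcio recipe: it forks one worker per CPU core with `multiprocessing`, and every worker binds the same port using the `grpc.so_reuseport` channel option so the kernel load-balances incoming connections. The commented-out servicer registration stands in for your generated `*_pb2_grpc` code.

```python
import asyncio
import multiprocessing

import grpc


async def serve(bind_address: str) -> None:
    # SO_REUSEPORT lets every worker process listen on the same port
    server = grpc.aio.server(options=[("grpc.so_reuseport", 1)])
    # add_GreeterServicer_to_server(GreeterServicer(), server)  # your service here (hypothetical name)
    server.add_insecure_port(bind_address)
    await server.start()
    await server.wait_for_termination()


def run_worker(bind_address: str) -> None:
    # Each worker runs its own event loop in its own process,
    # so each gets its own GIL and its own CPU core
    asyncio.run(serve(bind_address))


def main() -> None:
    bind_address = "[::]:50051"
    workers = [
        multiprocessing.Process(target=run_worker, args=(bind_address,))
        for _ in range(multiprocessing.cpu_count())
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


if __name__ == "__main__":
    main()
```

Note that `grpc.so_reuseport` relies on OS support (Linux has it; behaviour elsewhere varies), and state is not shared between workers, so anything like in-process caches must move to an external store.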

0 Answers