I have a Dask process that submits 100 tasks to the workers with a map function:
worker_args = .... # list of 100 elements, one set of parameters per task
futures = client.map(function_in_worker, worker_args)
worker_responses = client.gather(futures)
I use Docker, where each worker runs in its own container. I have configured Docker Compose to spawn 20 workers/containers, like so:
docker-compose up -d --scale worker=20
The problem is that my machine crashes, because the map function spreads the tasks across all 20 workers in parallel, which pushes memory and CPU past their limits.
I want to keep the 20-worker configuration, because I use the workers for other functions that don't require a large amount of memory.
How can I limit this map call to run on, say, only 5 workers in parallel?