I'm using map_async with worker processes that return a lot of data. The normal map_async keeps every result in memory and only returns them once everything has been processed. To get around this, I've used the generator approach from:

Combining itertools and multiprocessing?

However, this doesn't make full use of the worker processes: if 29 workers have finished and 1 is still hanging, the next batch of jobs won't start until every worker is done. Is there a way to make map_async, or a similar function, send each result to a callback function as soon as its worker finishes?

Chrismit

1 Answer

What you want is a producer-consumer solution. The producer puts tasks on a multiprocessing.Queue, and the consumers (subprocesses) get and process them in a loop, so each consumer picks up its next task as soon as it finishes the previous one.
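
As a rough illustration of that idea, here is a minimal sketch, not a drop-in replacement for map_async. The names `square`, `worker`, `run` and `SENTINEL` are made up for the example, and `square` stands in for whatever expensive job you actually run. It assumes `None` is never a real task, and that the caller consumes the generator to the end so the workers get joined.

```python
import multiprocessing as mp

SENTINEL = None  # hypothetical "no more work" marker; assumes None is never a real task


def square(x):
    # Stand-in for the real, expensive job (hypothetical).
    return x * x


def worker(tasks, results):
    # Consumer: keep pulling tasks until the sentinel arrives and push each
    # result back as soon as it is ready.
    for item in iter(tasks.get, SENTINEL):
        results.put((item, square(item)))


def run(items, n_workers=4):
    """Yield (item, result) pairs in completion order, not submission order."""
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(tasks, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()

    # Producer: enqueue every task, then one sentinel per worker.
    n_tasks = 0
    for item in items:
        tasks.put(item)
        n_tasks += 1
    for _ in procs:
        tasks.put(SENTINEL)

    # Hand each result to the caller as it arrives; drain the result queue
    # before joining so no worker is left blocked on a full pipe.
    for _ in range(n_tasks):
        yield results.get()
    for p in procs:
        p.join()


if __name__ == "__main__":
    for item, result in run(range(10)):
        print(item, result)
```

Because every worker takes its next task from the queue the moment it is free, one slow job doesn't hold up the rest, and the caller can process (or discard) each result as it is yielded instead of holding them all in memory.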

This is a good SO question with a (detailed) possible solution.

shx2