
I am using pathos.multiprocessing in Python 2, but I believe the same question applies to the standard multiprocessing module. My code looks like the following:

results = pool.map(func, list_of_args, chunksize=1)

I have read that pool.map returns its results in the same order as the input arguments, but that the order of computation is arbitrary (source: Python 3: does Pool keep the original order of data passed to map?).

However, I would like to ensure that the order of computation is not arbitrary and that it matches the order in which the arguments were presented. Something like:

results = pool.map(func, list_of_args, chunksize=1, compute_in_given_order=True)

To be clear, my question is not about the order in which the processes finish, but about the order in which they start: I want to ensure that the job for argument 3 in the list begins before the job for argument 4.
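
For reference, here is a minimal harness to observe the pickup order directly (a sketch using the standard multiprocessing module and Python 3 syntax; func and the sleep timings below are illustrative, not my real workload):

import multiprocessing
import os
import time

def func(arg):
    # Log the moment a worker picks this job up.
    print("started arg=%s in pid=%s" % (arg, os.getpid()), flush=True)
    time.sleep(0.1 * (arg % 3))  # uneven work, so finish order varies
    return arg * arg

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        results = pool.map(func, range(8), chunksize=1)
    print(results)  # always in argument order: [0, 1, 4, 9, ...]

The "started" lines show which arguments were picked up in which order, independently of when each job finishes.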

Is this possible? If not, why not?

numberwang
    Since python2 has been deprecated out of existence, I'd recommend upgrading to python3 first – inspectorG4dget Dec 01 '20 at 16:47
  • It is possible: stop using `multiprocessing` and just run `func` on `list_of_args` in a loop. The point of using `multiprocessing.pool` is to let multiple jobs happen at once. Using `map` will submit them in order, and return them in the right order. But it can't guarantee they process in that order - what if job 2 finishes before job 1 is done? – bbayles Dec 01 '20 at 16:52
  • I am happy for job 2 to finish before job 1 is done. However, I don't want job 4 to be picked up by a core before job 3 has been picked up. Does that make sense? – numberwang Dec 01 '20 at 16:55
  • in python3 `pool.map(fn, args)` is always in order – Grzegorz Krug Dec 01 '20 at 16:58
  • @bbayles, I've edited my question appropriately. But I think you're saying that pool.map always does the exact thing that I'm asking about anyway. In which case, does anyone know why the accepted answer for https://stackoverflow.com/questions/41273960/python-3-does-pool-keep-the-original-order-of-data-passed-to-map mentions that the order of computation is arbitrary? – numberwang Dec 01 '20 at 17:06
  • "I don't want job 4 to be picked up by a core before job 3 has been picked up. Does that make sense?" - it makes sense, but it's not how `Pool` is intended to be used. You could make it work by having the jobs be dependent on earlier ones finished - use one of the "Sharing state between processes" constructs for this. – bbayles Dec 01 '20 at 17:43
  • @bbayles thanks for replying but I am even more confused now. Two things. Firstly what did you mean by "using `map` will submit them in order"? Secondly I don't think your suggestion solves the problem. I don't want to say that job 3 has to be dependent on job 2 finishing, because job 1 might finish before job 2, and then job 3 can take its place. – numberwang Dec 01 '20 at 17:58
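
Expanding on bbayles's "Sharing state between processes" suggestion: below is a hedged sketch (Python 3 syntax; gated_func, init, and the counter/condition names are mine, not library API) adapted so that each job waits until all lower-indexed jobs have started, not finished. The gate is released as soon as a job passes it, so only the start order is serialized and the actual work still runs concurrently:

import multiprocessing

def init(counter, condition):
    # Runs once per worker process; stashes the shared gate objects.
    global start_counter, start_cond
    start_counter = counter
    start_cond = condition

def func(arg):
    return arg * arg  # stand-in for the real work

def gated_func(indexed_arg):
    i, arg = indexed_arg
    with start_cond:
        # Block until every job with a smaller index has *started*.
        start_cond.wait_for(lambda: start_counter.value == i)
        start_counter.value += 1
        start_cond.notify_all()
    # Past the gate: the real work runs concurrently with other jobs.
    return func(arg)

if __name__ == "__main__":
    counter = multiprocessing.Value('i', 0)
    cond = multiprocessing.Condition()
    # Keep chunksize=1: tasks are then handed out one at a time in
    # submission order, so no worker can be stuck waiting on a job
    # that has not been handed out yet (possible with larger chunks).
    with multiprocessing.Pool(4, initializer=init,
                              initargs=(counter, cond)) as pool:
        results = pool.map(gated_func, enumerate(range(8)), chunksize=1)
    print(results)

Note that in CPython's Pool, map with chunksize=1 already feeds tasks through a single shared queue in submission order, so the gate mostly makes that start ordering explicit and enforced rather than incidental.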

0 Answers