
I'm trying to run a calculation in a loop; each calculation creates, uses, and closes a pool. But the calculation only runs once and then throws an error: "Pool not running". Of course the old pool is not running, but shouldn't a new one be created?

Below is a simplified example, similar to my code. Stranger still, in my actual code the calculation runs 7 times before crashing, so I'm really confused about what the problem is. Any advice is appreciated!

from pathos.multiprocessing import ProcessingPool as Pool

def add_two(number):  
    return (number + 2)

def parallel_function(numbers):
    pool = Pool(10)
    result = pool.imap(add_two, numbers)
    pool.close()
    pool.join()    
    return(result)

sets=[
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5]
]

for one_set in sets:
    x = parallel_function(one_set)
    for i in x:
        print(i)
noxdafox
Anna
  • Is [this](https://stackoverflow.com/questions/52250054/python-valueerror-pool-not-running-in-async-multiprocessing/52250129) what you're looking for? – Jamie Feb 12 '20 at 12:19
  • Not exactly. In that example they wanted to do all calculations in one pool, whereas I use different pools (in general with different numbers of processes). It is important for me to close the pool inside `parallel_function`, because the function will be used separately. – Anna Feb 12 '20 at 12:21

2 Answers


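This is a limitation of pathos, which implements its Pool using the singleton pattern: a second call to `Pool(10)` reuses the pool that was already closed instead of starting a new one.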

This is the related issue ticket.

I would recommend using another worker-pool implementation.
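
As a minimal sketch of that recommendation, the question's example could be rewritten with the standard library's `concurrent.futures.ProcessPoolExecutor`, which builds an independent pool on each call. The `workers` parameter is a hypothetical addition for illustration, and the results are collected into a list before the pool shuts down:

```python
from concurrent.futures import ProcessPoolExecutor


def add_two(number):
    return number + 2


def parallel_function(numbers, workers=10):
    # `workers` is a hypothetical parameter, not part of the original code.
    # A new pool is created on every call and shut down when the `with`
    # block exits, so no pool state is shared between calls.
    with ProcessPoolExecutor(max_workers=workers) as executor:
        # Collect the results while the pool is still alive.
        return list(executor.map(add_two, numbers))


if __name__ == "__main__":
    sets = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    for one_set in sets:
        for i in parallel_function(one_set):
            print(i)
```

The `if __name__ == "__main__":` guard matters on platforms that start worker processes by re-importing the main module.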

noxdafox
  • Wow, never thought that's what it was! Do you have any suggestion on what to use instead? – Anna Feb 12 '20 at 12:49
  • As long as you don't have special needs, the builtin `multiprocessing.Pool` and `concurrent.futures.ProcessPoolExecutor` are just fine. If those do not satisfy your needs, you can take a look at [`pebble`](https://pypi.org/project/Pebble/) or [`billiard`](https://pypi.org/project/billiard/) – noxdafox Feb 12 '20 at 12:58

The following assumes that pathos behaves the same way as multiprocessing; it describes what the problem would be if you were using multiprocessing.

The problem is that your function closes the pool before the imap is finished:

def parallel_function(numbers):
    pool = Pool(10)
    result = pool.imap(add_two, numbers)
    pool.close()
    pool.join()    
    return(result)

This should be written as:

def parallel_function(numbers):
    with Pool(10) as pool:
        yield from pool.imap(add_two, numbers)
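
If a plain list is more convenient than a generator, a variant of the same idea (a sketch, assuming the built-in `multiprocessing.Pool` rather than pathos) is to consume the `imap` iterator inside the `with` block, since leaving the block terminates the pool. The `processes` parameter is a hypothetical addition for illustration:

```python
from multiprocessing import Pool


def add_two(number):
    return number + 2


def parallel_function(numbers, processes=10):
    # `processes` is a hypothetical parameter, not part of the original code.
    with Pool(processes) as pool:
        # Exhaust the iterator before the `with` block exits and
        # terminates the pool.
        return list(pool.imap(add_two, numbers))


if __name__ == "__main__":
    for one_set in [[1, 2, 3], [2, 3, 4], [3, 4, 5]]:
        print(parallel_function(one_set))
```

The trade-off is that the eager version holds all results in memory at once, while the generator version yields them one at a time.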
Dan D.