
I'm trying to use multiprocessing as in the following Python code.

Code:

from multiprocessing import Pool

def fibo(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibo(n-1) + fibo(n-2)

def print_fibo(n): 
    print(fibo(n))

num_list = [31, 32, 33, 34]

pool = Pool(processes=4) 
pool.map(print_fibo, num_list) 

Result:

In[1]: runfile('D:/PYTHONcoding/test.py', wdir='D:/PYTHONcoding')

The script never finishes and produces no output; it seems to be stuck in an infinite loop.

My machine has an Intel Xeon CPU, 16 GB of RAM, and a 1080 Ti GPU. Please let me know how to use multiprocessing.Pool correctly.


3 Answers


You should use the following condition in your main module:

if __name__ == '__main__':
    pool = Pool(processes=4)
    pool.map(print_fibo, num_list)
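
For completeness, here is a sketch of the whole script with the guard in place (the function definitions are unchanged from the question; the close/join calls are my addition for a tidy shutdown):

from multiprocessing import Pool

def fibo(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibo(n-1) + fibo(n-2)

def print_fibo(n):
    print(fibo(n))

num_list = [31, 32, 33, 34]

if __name__ == '__main__':
    # Only the parent process enters this block; workers that re-import
    # the module skip it, so no extra processes get spawned.
    pool = Pool(processes=4)
    pool.map(print_fibo, num_list)
    pool.close()
    pool.join()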

With this change, your code would output (takes about 5 seconds for me on an average laptop):

1346269
2178309
3524578
5702887
– blhsing

Your usage is just fine. I added a little instrumentation and ran it on a handy cluster:

Calculated fibo( 31 ) =  1346269 with 4356617 calls
Calculated fibo( 32 ) =  2178309 with 7049155 calls
Calculated fibo( 33 ) =  3524578 with 11405773 calls
Calculated fibo( 34 ) =  5702887 with 18454929 calls

You're making over 40 million calls to fibo; that can be slow, depending on which Xeon(R) is running your box. If you want to speed things up, try dynamic programming / memoization:

calls = 0
memo = {0: 0, 1: 1}

def fibo(n):
    global calls, memo

    calls += 1
    # if n > 0 and n % 10 == 0: print("ENTER n =", n)
    if n not in memo:
        # Compute each value once; later calls reuse the cached result.
        memo[n] = fibo(n-1) + fibo(n-2)
    return memo[n]
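
For reference, here is the kind of driver that produces lines in the format shown (my reconstruction; the exact instrumentation isn't shown in the answer):

from multiprocessing import Pool

def print_fibo(n):
    # Arguments print left to right, so fibo(n) runs before calls is read.
    # calls and memo are per-process globals; with four tasks and four
    # workers, each task typically starts with a fresh count.
    print("Calculated fibo(", n, ") = ", fibo(n), "with", calls, "calls")

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        pool.map(print_fibo, [31, 32, 33, 34])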

Output:

Calculated fibo( 31 ) =  1346269 with 61 calls
Calculated fibo( 32 ) =  2178309 with 63 calls
Calculated fibo( 33 ) =  3524578 with 65 calls
Calculated fibo( 34 ) =  5702887 with 67 calls
– Prune

blhsing's answer identifies the root of the problem: on Windows, multiprocessing must start a new instance of Python for each computational process. Each new Python loads the file(s) that define the various functions, then waits for directives from the master / controlling Python that spawned it. But if the file(s) that multiprocessing loads spawn additional Pythons unconditionally, without an if __name__ == '__main__' test, those additional Pythons spawn more Pythons, which spawn yet more Pythons, without end.

(Essentially, the problem here is recursion without a base case.)
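
You can check that "base case" directly: under the spawn start method (the Windows default), modern CPython re-imports the main module in each worker under the name __mp_main__, not __main__, which is exactly why the guarded block runs only in the controlling process. A small sketch, assuming Python 3.4+:

from multiprocessing import Pool

def report(_):
    return __name__  # '__mp_main__' inside a spawned worker

if __name__ == '__main__':
    print(__name__)  # '__main__' in the controlling process
    with Pool(processes=2) as pool:
        print(pool.map(report, range(2)))  # ['__mp_main__', ...] under spawn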

Prune's answer, suggesting memoization, is also reasonable. Note that memoization can be done without global variables. See What is memoization and how can I use it in Python? for a prepackaged version. One I like to use as a demo makes use of the fact that you can set attributes on functions:

def fibo(n):
    if n <= 1:
        return 0 if n < 1 else 1
    ns = str(n)
    if not hasattr(fibo, ns):
        setattr(fibo, ns, fibo(n - 1) + fibo(n - 2))
    return getattr(fibo, ns) 

We handle the base cases up front to avoid recursion. Then we turn the argument n (which is presumably a number) into a string ns for getattr and setattr. If the memoized answer is not available, we set it with a recursive call; then we return the memoized answer.
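
If you'd rather not manage the cache yourself, the standard library packages the same idea as functools.lru_cache; a sketch:

from functools import lru_cache

@lru_cache(maxsize=None)  # cache the result for every distinct n
def fibo(n):
    return n if n <= 1 else fibo(n - 1) + fibo(n - 2)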

– torek
  • Thank you. Adding if __name__ == '__main__' to this code works well. I have an additional question: I want to use multiprocessing.Pool inside a function (I mean a def). Could I just write it like the following example? >> def function_name(param): ... if __name__ == '__main__': ... pool = Pool(processes=num) ... pool.map(another_def, range) – Hyojeong Moon Nov 07 '18 at 02:34
  • @HyojeongMoon: It's too hard to read code in comments. If you have a new question, post a new question. (But be sure to search for existing answers first.) – torek Nov 07 '18 at 03:44