
I was taught that to make a variable available in a child process of a multiprocessing pool, you needed to use an initializer.

Strangely, I can access variables defined in the main block from the child process, without using an initializer:

import multiprocessing
import numpy as np

def ChildFun(i):
    print(myValue)
    print(f'Processing the index {i}')

if __name__ == "__main__":
    myValue = 'This should not appear'
    myList = np.arange(5)
    with multiprocessing.Pool() as pool:
        pool.map(ChildFun, myList)

Normally, I would expect to only see

Processing the index 2
Processing the index 0
Processing the index 3
Processing the index 1
Processing the index 4

But I get

This should not appear
This should not appear
This should not appear
This should not appear
This should not appear
Processing the index 2
Processing the index 0
Processing the index 3
Processing the index 1
Processing the index 4

How come? Does multiprocessing import all the variables from the main process, even those protected by if __name__ == "__main__":? Or does it fall back to the main process for names it cannot find in the child process?

VicN

1 Answer


It seems this is due to how Unix handles forks, as explained here: Python multiprocessing--global variables in separate processes sharing id? On Unix the default start method is fork, so each child process starts with a copy of the parent's memory at the moment the pool is created, including myValue.
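To see that the behaviour depends on the start method, you can force the Windows-style "spawn" method on Unix. A minimal sketch (the NameError handling is mine, added for illustration): with spawn, the child re-imports the main module, the if __name__ == "__main__": block does not run there, and myValue is undefined in the worker.

```python
import multiprocessing

def ChildFun(i):
    try:
        print(myValue)  # only exists if the child was forked from the parent
    except NameError:
        print('myValue is not defined in this child')
    print(f'Processing the index {i}')

if __name__ == "__main__":
    myValue = 'This should not appear'
    # Force the start method Windows uses; the child re-imports this module
    # instead of inheriting the parent's memory.
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool() as pool:
        pool.map(ChildFun, range(5))
```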

My guess is that using an initializer is cleaner (you know exactly what you pass) and more explicit for sharing read-only variables. But mostly, it is probably the way to make your code work on other platforms (typically Windows), which don't have the same forking mechanism.

VicN