When you do multiprocessing on a platform that uses the spawn method to create new processes, any code at global scope that is not within an if __name__ == '__main__': block will first be executed by the child process, in order to initialize its storage, before the worker function f1 is invoked.
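As a rough illustration of the re-execution described above, the sketch below writes a small hypothetical script to a temporary file, runs it with the spawn start method forced, and returns what it printed. The names demo.py, worker and run_spawn_demo are illustrative only and do not come from your code. The module-level print should appear twice: once in the parent, where __name__ is '__main__', and once in the spawned child, where the main module is re-imported under the name '__mp_main__' (which is why guarded blocks are skipped there).

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical demo script; `worker` and `demo.py` are illustrative names.
SCRIPT = textwrap.dedent('''
    import multiprocessing as mp

    # Module-level statement: runs in the parent and again in the spawned child.
    print(f"module level, __name__ = {__name__}", flush=True)

    def worker():
        print("worker running", flush=True)

    if __name__ == "__main__":
        # Force spawn so the behaviour is the same on every platform.
        mp.set_start_method("spawn")
        p = mp.Process(target=worker)
        p.start()
        p.join()
''')


def run_spawn_demo() -> str:
    """Write the script to a temporary file, run it, and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo.py"
        path.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, check=True, timeout=60,
        )
        return result.stdout


if __name__ == "__main__":
    print(run_spawn_demo())
```

The sketch only counts lines rather than relying on their order, since parent and child share the same output stream.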
In your posted code, when the child process is created it will therefore execute the following statements in order:
1. from multiprocessing import *
2. q = Queue()  # create global queue
3. def f1(x, q):  # create function definition
4. def main_f():  # create function definition
5. main_f()  # call main_f
6. print('05')
In reality, the only statement that needs to be executed by the child process before the worker function f1 is invoked is statement #3 above, which defines the worker function for the child process.
Statement #1 imports a package not used by your child process. This does not prevent the program from running correctly, but Python spends time performing an import that is not used.
Statement #2 needlessly creates a new queue instance in the child process, distinct from the one created in the main process. It would be disastrous if your child process used this queue, since it would be putting elements on a different queue than the one the main process is getting from. Fortunately, function f1 is not referencing this global queue; it is using the queue that is passed as an argument.
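To make the "different queue" problem concrete, here is a minimal sketch under stated assumptions: a hypothetical script (queue_demo.py, with an illustrative worker that deliberately uses the module-level queue instead of an argument) is run with spawn forced. The child puts its item on its own re-created queue, so the parent's get is expected to time out.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical script; `worker` and `queue_demo.py` are illustrative names.
SCRIPT = textwrap.dedent('''
    import multiprocessing as mp
    import queue

    ctx = mp.get_context("spawn")  # force spawn on every platform
    q = ctx.Queue()                # module-level queue: re-created in the child

    def worker():
        # Uses the child's own module-level q, not the parent's instance.
        q.put("from child")

    if __name__ == "__main__":
        p = ctx.Process(target=worker)
        p.start()
        p.join()
        try:
            print("parent received:", q.get(timeout=2))
        except queue.Empty:
            print("parent received nothing")
''')


def run_queue_demo() -> str:
    """Run the script in a fresh interpreter and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "queue_demo.py"
        path.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, check=True, timeout=60,
        )
        return result.stdout


if __name__ == "__main__":
    print(run_queue_demo())
```

Passing the queue as an argument to the worker, as your f1 does, avoids this because the child then receives a handle to the parent's queue.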
Statement #4 defines a function not used by the child process. It doesn't prevent the program from running but is wasteful.
Statement #5 invokes main_f. This is where your troubles begin. All the code within main_f that is not within an if __name__ == '__main__': block will get executed immediately, before your worker function is invoked. This is what is causing an extra '01' to be printed.
Statement #6 likewise is what is causing an extra '05' to be printed.
At a minimum, to get your program working correctly, your code should therefore be:
from multiprocessing import *

def f1(x, q):
    print('03')
    x = x + " world"
    q.put(x)

def main_f():
    # large data/complex use multiprocessing, else use ordinary function
    q = Queue()  # comm between parent n child process
    print('01')
    mp = Process(target=f1, args=("hello", q))
    print('02')
    mp.start()
    print(q.get())
    mp.join()
    print('04')

if __name__ == '__main__':
    main_f()
If we want to eliminate all possible inefficiencies, i.e. prevent unnecessary statements from being executed when the child process is initialized, then:
def f1(x, q):
    print('03')
    x = x + " world"
    q.put(x)

if __name__ == '__main__':
    from multiprocessing import *

    def main_f():
        # large data/complex use multiprocessing, else use ordinary function
        q = Queue()  # comm between parent n child process
        print('01')
        mp = Process(target=f1, args=("hello", q))
        print('02')
        mp.start()
        print(q.get())
        mp.join()
        print('04')

    main_f()
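
One caveat with this layout is that anything defined or imported inside the guard does not exist in the spawned child, so the worker function must not depend on it. The version above is fine because f1 uses only built-ins and its arguments. The sketch below shows what might happen otherwise, using a hypothetical script (guard_demo.py) whose illustrative worker needs math, which is imported only inside the guard; the child is expected to fail with a NameError and exit code 1.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical script; `worker` and `guard_demo.py` are illustrative names.
SCRIPT = textwrap.dedent('''
    def worker():
        # `math` is imported only inside the guard below, so the spawned
        # child never defines it and this line raises NameError there.
        print(math.sqrt(16.0), flush=True)

    if __name__ == "__main__":
        import math
        import multiprocessing as mp

        mp.set_start_method("spawn")  # force spawn on every platform
        p = mp.Process(target=worker)
        p.start()
        p.join()
        print("child exit code:", p.exitcode)
''')


def run_guard_demo() -> tuple[str, str]:
    """Run the script in a fresh interpreter; return (stdout, stderr)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "guard_demo.py"
        path.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, check=True, timeout=60,
        )
        return result.stdout, result.stderr


if __name__ == "__main__":
    stdout, stderr = run_guard_demo()
    print(stdout)
    print(stderr)
```

Imports that the worker itself needs can be placed inside the worker function or kept at global scope; only the parent-only setup belongs inside the guard.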