As we all know, we need to protect the `main()` when running code with `multiprocessing` in Python, using `if __name__ == '__main__'`.
I understand that this is necessary in some cases to give access to functions defined in the main module, but I do not understand why it is necessary in this case:
file2.py

```python
import numpy as np
from multiprocessing import Pool


class Something(object):

    def get_image(self):
        return np.random.rand(64, 64)

    def mp(self):
        image = self.get_image()
        p = Pool(2)
        res1 = p.apply_async(np.sum, (image,))
        res2 = p.apply_async(np.mean, (image,))
        print(res1.get())
        print(res2.get())
        p.close()
        p.join()
```
main.py

```python
from file2 import Something

s = Something()
s.mp()
```
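For reference, the standard fix from the `multiprocessing` docs is to guard the top-level code in main.py:

```python
from file2 import Something

if __name__ == '__main__':
    # Only the original process enters this block; on Windows the
    # spawned workers re-import main.py under a different __name__
    # and therefore skip it.
    s = Something()
    s.mp()
```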
All of the functions and imports necessary for `Something` to work are part of file2.py. Why does the subprocess need to re-run main.py?
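You can observe the re-run directly (a hypothetical experiment, not part of my actual code): add a top-level `print` to the unprotected main.py and run it on Windows.

```python
# main.py, left unprotected on purpose to observe the re-import
from file2 import Something

# On Windows (spawn start method) every pool worker re-imports this
# module, so this line prints once in the parent (__name__ is
# '__main__') and once per worker (__name__ is '__mp_main__'),
# before the unguarded s.mp() call fails with the well-known
# "bootstrapping phase" RuntimeError.
print('importing main.py, __name__ =', __name__)

s = Something()
s.mp()
```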
I think the `__name__` guard is not a very nice solution, as it prevents me from distributing the code of file2.py: I can't make sure that users protect their main module.
Isn't there a workaround for Windows?
How do packages solve this? (I have never run into a problem with any package despite not protecting my main. Are they just not using `multiprocessing`?)
edit:
I know that this is because `fork()` is not implemented on Windows. I was just asking whether there is a hack to let the interpreter start at file2.py instead of main.py, since I can be sure that file2.py is self-sufficient.
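One partial hack I can think of (a sketch under my own assumptions, not a supported solution): when a spawned worker re-imports main.py, the worker's process name has already been set, so file2.py could detect this and refuse to create a nested pool.

```python
# file2.py with a defensive guard (sketch only)
import multiprocessing
from multiprocessing import Pool

import numpy as np


class Something(object):

    def get_image(self):
        return np.random.rand(64, 64)

    def mp(self):
        # When a spawned worker re-imports main.py, the top-level
        # s.mp() call runs again inside the worker.  At that point
        # the process name is no longer 'MainProcess', so we can
        # bail out instead of recursively creating another pool.
        if multiprocessing.current_process().name != 'MainProcess':
            return
        image = self.get_image()
        p = Pool(2)
        res1 = p.apply_async(np.sum, (image,))
        res2 = p.apply_async(np.mean, (image,))
        print(res1.get())
        print(res2.get())
        p.close()
        p.join()
```

Note that this does not stop the rest of main.py's top-level code from re-running in every worker, so it only helps if main.py is cheap and side-effect-free; the `if __name__ == '__main__'` guard remains the supported answer.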