From this question I learned that:
When you use multiprocessing to start a second process, an entirely new Python interpreter, with its own global state, is launched. That global state is not shared, so changes made by child processes to a global variable are invisible to the parent process.
To verify this behavior, I made a test script:
import time
import multiprocessing as mp
from multiprocessing import Pool

x = [0]  # global

def worker(c):
    if c == 1:  # wait for proc 2 to finish; is global x overwritten by now?
        time.sleep(2)
    print('enter: x =', x, 'with id', id(x), 'in proc', mp.current_process())
    x[0] = c
    print('exit: x =', x, 'with id', id(x), 'in proc', mp.current_process())
    return x[0]

if __name__ == '__main__':  # required on platforms that spawn rather than fork
    pool = Pool(processes=2)
    x_vals = pool.map(worker, [1, 2])
    print('parent: x =', x, 'with id', id(x), 'in proc', mp.current_process())
    print('final output', x_vals)
The output (on CPython on Linux, where the pool workers are forked) is something like:
enter: x = [0] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-2, started daemon)>
exit: x = [2] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-2, started daemon)>
enter: x = [0] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-1, started daemon)>
exit: x = [1] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-1, started daemon)>
parent: x = [0] with id 140138406834504 in proc <_MainProcess(MainProcess, started)>
final output [1, 2]
How should I explain the fact that the id of x is the same in all the processes, yet x takes different values? Isn't id conceptually the memory address of a Python object?
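The same behavior shows up without the Pool machinery. Here is a minimal sketch using os.fork directly (so it assumes a Unix system), which prints the same id in parent and child even though the values diverge:

import os

x = [0]
pid = os.fork()                    # child starts as a clone of the parent
if pid == 0:                       # we are in the child process
    x[0] = 1
    print('child :', x, 'id', id(x))
    os._exit(0)                    # don't let the child run the parent's code
os.waitpid(pid, 0)                 # let the child finish first
print('parent:', x, 'id', id(x))   # same id, but x is still [0]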
I guess this is possible if the memory space gets cloned in the child processes, so that id reports the same virtual address in each of them. Then is there something I can use to get the actual physical memory address of a Python object?
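The closest thing I know of is Linux-specific: /proc/self/pagemap maps each virtual page of a process to a physical page frame number. Below is a sketch of that idea (virt_to_phys is my own helper, not a library function); it assumes that id returns the object's virtual address (true in CPython, but an implementation detail) and it normally has to run as root, because since Linux 4.0 the frame number reads as zero for unprivileged processes.

import mmap

def virt_to_phys(virt):
    # Each 8-byte pagemap entry describes one virtual page of this process.
    with open('/proc/self/pagemap', 'rb') as f:
        f.seek((virt // mmap.PAGESIZE) * 8)
        entry = int.from_bytes(f.read(8), 'little')
    if not entry & (1 << 63):          # bit 63: page is present in RAM
        raise RuntimeError('page not present')
    pfn = entry & ((1 << 55) - 1)      # bits 0-54: physical frame number
    return pfn * mmap.PAGESIZE + (virt % mmap.PAGESIZE)

x = [0]
print(hex(id(x)), '->', hex(virt_to_phys(id(x))))  # virtual -> physical

Is this the right approach, or is there something more portable?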