The reason that the new item appended to `d[1]` is not printed is stated in Python's official documentation:
> Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified. To modify such an item, you can re-assign the modified object to the container proxy.
Therefore, this is actually what happens:
```python
from multiprocessing import Process, Manager

manager = Manager()
d = manager.dict()

def f():
    # invoke d.__getitem__(), returning a local copy of the empty list
    # assigned by the main process (no KeyError was raised, so a list was
    # definitely returned), and append 4 to it; however, this change is not
    # propagated through the manager, as it is performed on an ordinary
    # list with which the manager has no interaction
    d[1].append(4)
    # convert d to a string via d.__str__() (see https://docs.python.org/2/reference/datamodel.html#object.__str__),
    # returning the "remote" string representation of the object
    # (see https://docs.python.org/2/library/multiprocessing.html#multiprocessing.managers.SyncManager.list),
    # to which the change above was not propagated
    print d

if __name__ == '__main__':
    # invoke d.__setitem__(), propagating this assignment
    # (mapping 1 to an empty list) through the manager
    d[1] = []
    p = Process(target=f)
    p.start()
    p.join()
```
Reassigning `d[1]` with a new list, or even re-assigning the same list after it has been updated, triggers the manager to propagate the change:
```python
from multiprocessing import Process, Manager

manager = Manager()
d = manager.dict()

def f():
    # perform the same steps explained in the comments to the previous
    # snippet, but additionally invoke d.__setitem__() with the changed
    # item in order to propagate the change
    l = d[1]
    l.append(4)
    d[1] = l
    print d

if __name__ == '__main__':
    d[1] = []
    p = Process(target=f)
    p.start()
    p.join()
```
The line `d[1] += [4]` would have worked as well.
**EDIT for Python 3.6 or later:**
Since Python 3.6, per this changeset following this issue, it's also possible to use nested proxy objects, which automatically propagate any changes performed on them to the containing proxy object. Thus, replacing the line `d[1] = []` with `d[1] = manager.list()` would correct the issue as well:
```python
from multiprocessing import Process, Manager

manager = Manager()
d = manager.dict()

def f():
    d[1].append(4)
    # the __str__() method of a dict object invokes __repr__() on each of
    # its items, so __str__() must be invoked explicitly in order to print
    # the actual list items
    print({k: str(v) for k, v in d.items()})

if __name__ == '__main__':
    d[1] = manager.list()
    p = Process(target=f)
    p.start()
    p.join()
```
Unfortunately, this bug fix was not ported to Python 2.7 (as of Python 2.7.13).
**NOTE (running under the Windows operating system):**
Although the described behaviour applies to Windows as well, the attached code snippets would fail when executed under Windows due to its different process creation mechanism, which relies on the `CreateProcess()` API rather than the `fork()` system call (which Windows does not support).
Whenever a new process is created via the multiprocessing module, Windows starts a fresh Python interpreter process that imports the main module, with potentially hazardous side effects. To circumvent this issue, the following programming guideline is recommended:
> Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
Therefore, executing the attached code snippets as-is under Windows would try to create an infinite number of processes, due to the `manager = Manager()` line at module level. This can be easily fixed by creating the `Manager` and `Manager.dict` objects inside the `if __name__ == '__main__'` clause and passing the `Manager.dict` object as an argument to `f()`, as done in this answer.
More details on the issue may be found in this answer.