
I have written a program like this:

from multiprocessing import Process, Manager

def worker(i):
    x[i].append(i)

if __name__ == '__main__':
    manager = Manager()
    x = manager.list()
    for i in range(5):
        x.append([])
    p = []
    for i in range(5):
        p.append(Process(target=worker, args=(i,)))
        p[i].start()

    for i in range(5):
        p[i].join()

    print x

I want to create a list of lists that is shared among processes, with each process modifying one of the sublists. But the result of this program is a list of empty lists: [[],[],[],[],[]].

What's going wrong?

Tim
Eric Xu

1 Answer


I think this is because of a quirk in the way Managers are implemented.

If you create two Manager.list objects, and then append one of the lists to the other, the type of the list that you append changes inside the parent list:

>>> type(l)
<class 'multiprocessing.managers.ListProxy'>
>>> type(z)
<class 'multiprocessing.managers.ListProxy'>
>>> l.append(z)
>>> type(l[0])
<class 'list'>   # Not a ListProxy anymore

l[0] and z are not the same object, and don't behave quite the way you'd expect as a result:

>>> l[0].append("hi")
>>> print(z)
[]
>>> z.append("hi again")
>>> print(l[0])
['hi again']

As you can see, changing the nested list doesn't have any effect on the ListProxy object, but changing the ListProxy object does change the nested list. The documentation actually explicitly notes this:

Note

Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified. To modify such an item, you can re-assign the modified object to the container proxy:
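The re-assignment that the note describes can be sketched as follows. This is a minimal Python 3 sketch, not the documentation's own example, and the names worker and run are made up for illustration. It assumes the sublists are plain lists stored in the managed list and that each process touches a different index.

```python
from multiprocessing import Manager, Process


def worker(shared, i):
    # Indexing the proxy returns a local plain-list copy of the sublist.
    sub = shared[i]
    sub.append(i)
    # Assigning through the proxy sends __setitem__ to the manager,
    # so the modified sublist is stored in the shared list.
    shared[i] = sub


def run(n=5):
    with Manager() as manager:
        shared = manager.list([[] for _ in range(n)])
        procs = [Process(target=worker, args=(shared, i)) for i in range(n)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy the contents out before the manager shuts down.
        return list(shared)


if __name__ == "__main__":
    print(run())
```

The read, modify, re-assign sequence is not atomic. Two processes updating the same index this way could overwrite each other's change, so that case would need a lock around the three steps.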

Digging through the source code, you can see that when you call append on a ListProxy, the append call is actually sent to a manager object via IPC, and then the manager calls append on the shared list. That means that the args to append need to get pickled/unpickled. During the unpickling process in the manager, the ListProxy object gets turned into a regular Python list: the object the ListProxy was pointing to (aka its referent). This is also noted in the documentation:

An important feature of proxy objects is that they are picklable so they can be passed between processes. Note, however, that if a proxy is sent to the corresponding manager’s process then unpickling it will produce the referent itself. This means, for example, that one shared object can contain a second

So, going back to the l and z session above, why does updating z also update l[0], but not the other way around? On the manager side, the parent list holds z's referent itself, as the documentation quoted above says, so a change made through z is visible the next time l[0] is fetched. Each l[0] lookup, however, returns a plain local copy of that sublist. The copy knows nothing about the proxy, so when you change the copy, nothing in the manager changes.
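The one-way behavior comes down to pickling producing independent copies. Here is a minimal sketch of that effect without any manager involved. The helper fetch_item is hypothetical and only stands in for what a lookup through a proxy hands back; it is not part of multiprocessing.

```python
import pickle


def fetch_item(container, index):
    """Hypothetical stand-in for a proxy lookup: return the item after a
    pickle round trip, the way a value arrives from another process."""
    return pickle.loads(pickle.dumps(container[index]))


if __name__ == "__main__":
    stored = [[], []]              # plays the role of the list held by the manager
    local = fetch_item(stored, 0)  # an independent copy of stored[0]
    local.append("hi")
    print(stored)  # [[], []]
    print(local)   # ['hi']
```

Appending to the copy leaves the stored sublist untouched, which mirrors what happens with l[0].append("hi") in the session above.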

So, in order to make your example work, you need to create a new manager.list() object every time you want to modify a sublist, and only update that proxy object directly, rather than updating it via the index of the parent list:

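#!/usr/bin/python

from multiprocessing import Process, Manager

def worker(x, i):
    # `manager` is created under the __main__ guard below, so this relies on
    # the child inheriting it through fork.
    # Copy the current sublist into a fresh proxy and update that proxy directly.
    sub_l = manager.list(x[i])
    sub_l.append(i)
    # Re-assign through the parent proxy so the manager stores the change.
    x[i] = sub_l


if __name__ == '__main__':
    manager = Manager()
    # Five distinct empty sublists ([[]]*5 would repeat one list object).
    x = manager.list([[] for _ in range(5)])
    p = []
    for i in range(5):
        p.append(Process(target=worker, args=(x, i)))
        p[i].start()

    for i in range(5):
        p[i].join()

    print x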

Here's the output:

dan@dantop2:~$ ./multi_weirdness.py 
[[0], [1], [2], [3], [4]]
dano
  • Thanks for your answer! I intended to change the sublist via the parent list, but it seems I can't. I originally used a list, but it slows down my parallel runs. Is there any other way to use a shared list that avoids this problem? – Eric Xu May 13 '14 at 06:32
  • See the example at the end of my answer. It shows how you can work around the issue by creating a new manager.list from the sublist, updating the new proxy object, and then inserting the proxy back into the parent list. – dano May 13 '14 at 06:51
  • I ran into a similar problem, but the workaround is not an option because creating too many proxies would take too much time. Is there another workaround? – Davoud Taghawi-Nejad May 12 '17 at 09:15