50

I need to append objects to a single list L from different processes using multiprocessing, but the list comes back empty. How can I let multiple processes append to L?

#!/usr/bin/python
from multiprocessing import Process

L=[]
def dothing(i,j):
    L.append("anything")
    print i

if __name__ == "__main__":
    processes=[]
    for i in range(5):
        p=Process(target=dothing,args=(i,None))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()

print L
martineau
Adam

3 Answers

64

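Global variables are not shared between processes. Each child process works on its own copy of L, so the parent's list stays empty.

You need to use a list created with multiprocessing.Manager().list():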

from multiprocessing import Process, Manager

def dothing(L, i):  # the managed list `L` passed explicitly.
    L.append("anything")

if __name__ == "__main__":
    with Manager() as manager:
        L = manager.list()  # <-- can be shared between processes.
        processes = []
        for i in range(5):
            p = Process(target=dothing, args=(L,i))  # Passing the list
            p.start()
            processes.append(p)
        for p in processes:
            p.join()
        print(L)

See Sharing state between processes (the "Server process" part) in the multiprocessing documentation.
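
To make the difference concrete, here is a minimal sketch that hands one child a plain list and another child a managed list. The helper names `add_item` and `compare_plain_and_managed` are made up for this illustration and are not part of the original answer.

```python
from multiprocessing import Process, Manager


def add_item(target, item):
    # Runs in the child process and appends to whatever list-like object it received.
    target.append(item)


def compare_plain_and_managed():
    plain = []  # ordinary list: the child only ever changes its own copy
    with Manager() as manager:
        managed = manager.list()  # proxy: appends are forwarded to the manager process
        workers = [
            Process(target=add_item, args=(plain, "from child")),
            Process(target=add_item, args=(managed, "from child")),
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # Copy the managed data out while the manager is still running.
        return plain, list(managed)


if __name__ == "__main__":
    plain, managed = compare_plain_and_managed()
    print(plain)    # the parent's plain list is unchanged
    print(managed)  # the managed list holds the child's item
```

Passing the plain list as an argument does not help: the child still receives a copy, so only the managed list reflects the append.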

falsetru
  • This is printing an empty list. Did something change in Python 3.x? – Tronald Dump May 17 '18 at 19:04
  • @TronaldDump, I tested in Python 2.7 / 3.6 (on Ubuntu 18.04). It just works for me. https://i.imgur.com/FX2GmRp.png – falsetru May 17 '18 at 23:49
  • I was using Jupyter Notebook and it wasn't working. It worked in Spyder, though. – Tronald Dump May 17 '18 at 23:54
  • @TronaldDump, how about posting a separate question with the `jupyter` tag? – falsetru May 18 '18 at 00:22
  • @falsetru I modified your code to append i instead of "anything" and added sleep(4) after that inside the dothing function. The results show that the list is not appended in order. I ran it for 50 numbers and the result looks like this: [0, 1, 2, 3, 4, 5, 7, 6, 9, 8, 10, 14, 11, 17, 16, 15, 12, 19, 21, 18, 13, 20, 23, 24, 25, 22, 27, 26, 29, 32, 30, 28, 43, 34, 33, 31, 36, 40, 37, 38, 41, 42, 39, 35, 45, 44, 46, 48, 47, 49]. Is there a way to preserve the order? – eSadr May 23 '18 at 19:16
  • @falsetru noob question: does this code automatically run on all the available cores? – shamalaia Nov 25 '20 at 10:30
  • @shamalaia, replace the body of the `dothing` function with a never-ending loop. Then you will see it burn your processors (5 CPUs, because `range(5)` starts 5 processes). – falsetru Nov 25 '20 at 10:39
  • After running into the same problem, I ended up on this thread and got out of bed at dawn to test it. You made my day, I'll sleep better now. Thank you! – Jose Velasco Jun 02 '21 at 03:13
  • May I ask: if the OP had passed the list as an argument (rather than using the global variable approach), would it work correctly? In this case, would each child process have a reference (pointer) to the original list? – variable Jun 24 '21 at 06:27
  • @variable, I don't understand what you are asking. Please post a separate question with example so that others can answer you. – falsetru Jun 24 '21 at 10:14
  • I meant to ask: if the OP had passed L into the child process, would the parent and child process share the same L list address? – variable Jun 24 '21 at 10:15
  • @variable, however the OP passes L into the child process, as a parameter/argument or as a global variable, the answer is no. – falsetru Jun 24 '21 at 13:38
  • @falsetru Doesn't this approach suffer from false sharing? Is there a reason why you didn't use a mutex? I can't find in the documentation whether the manager provides synchronization or not. – Asil Sep 17 '22 at 15:29
  • @Asil, I don't know what you mean by "false sharing". Personally, I prefer not to use a mutex such as a lock. IMO, for this task, [using a process/thread pool](https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool) looks like a better approach, since it doesn't require direct use of locks. – falsetru Sep 17 '22 at 15:52
  • @falsetru false sharing is a very well-known multi-processing usage pattern and the reason why mutex is needed. Regarding pools, I again can't see any hints on how python handles it and still think I must use mutex with the process pool as well. But thanks for your answer and for responding to my comment on this old post. – Asil Sep 17 '22 at 23:55
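
For the ordering question and the process-pool suggestion in the comments above, a minimal sketch might look like the following. The names `work` and `collect` are hypothetical, and the task itself is a stand-in. The idea is that each worker returns its result instead of appending to a shared list, and the pool assembles the results.

```python
from multiprocessing import Pool


def work(i):
    # Stand-in task: return the value instead of appending to a shared list.
    return "item %d" % i


def collect(n_items, n_workers=4):
    with Pool(processes=n_workers) as pool:
        # map() returns the results in the order of the inputs,
        # regardless of which worker finishes first.
        return pool.map(work, range(n_items))


if __name__ == "__main__":
    print(collect(5))
```

With this shape there is no shared list to synchronize, so no explicit lock appears in user code.
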
9

Falsetru's answer worked.

However, the list was not accessible outside the with Manager() as manager: block. Two changes were needed:

  1. Add L = [] before the if __name__ == "__main__": statement. Without it, the last print(L) (the one outside the if) raises an error that L is not defined and the code breaks. That print runs several times because, when processes are started with the spawn method (the default on Windows and, since Python 3.8, on macOS), every spawned process re-imports the main module and so re-executes everything outside the if block.

  2. Add L = list(L) after the p.join() loop, while still inside the with block. This converts the manager.list proxy into a regular Python list. Once the with block exits, the manager process shuts down and calls on the proxy raise errors.

from multiprocessing import Process, Manager

def dothing(L, i):  # the managed list `L` passed explicitly.
    for j in range(5):
        text = "Process " + str(i) + ", Element " + str(j)
        L.append(text)

L = [] 

if __name__ == "__main__":
    with Manager() as manager:
        L = manager.list()  # <-- can be shared between processes.
        processes = []

        for i in range(5):
            p = Process(target=dothing, args=(L,i,))  # Passing the list
            p.start()
            processes.append(p)

        for p in processes:
            p.join()

        L = list(L) 
        print("Within WITH")
        print(L)

    print("Within IF")
    print(L)

print("Outside of IF")
print(L)

Output:

Outside of IF
[]
Outside of IF
[]
Outside of IF
[]
Outside of IF
[]
Outside of IF
[]
Outside of IF
[]
Within WITH
['Process 2, Element 0','Process 2, Element 1', 'Process 2, Element 2',
'Process 2, Element 3', 'Process 2, Element 4', 'Process 1, Element 0', 
'Process 1, Element 1', 'Process 1, Element 2', 'Process 1, Element 3', 
'Process 1, Element 4', 'Process 0, Element 0', 'Process 0, Element 1', 
'Process 0, Element 2', 'Process 0, Element 3', 'Process 0, Element 4', 
'Process 4, Element 0', 'Process 4, Element 1', 'Process 4, Element 2', 
'Process 4, Element 3', 'Process 4, Element 4', 'Process 3, Element 0', 
'Process 3, Element 1', 'Process 3, Element 2', 'Process 3, Element 3', 
'Process 3, Element 4']

Within IF
['Process 2, Element 0','Process 2, Element 1', 'Process 2, Element 2', 
'Process 2, Element 3', 'Process 2, Element 4', 'Process 1, Element 0', 
'Process 1, Element 1', 'Process 1, Element 2', 'Process 1, Element 3', 
'Process 1, Element 4', 'Process 0, Element 0', 'Process 0, Element 1', 
'Process 0, Element 2', 'Process 0, Element 3', 'Process 0, Element 4', 
'Process 4, Element 0', 'Process 4, Element 1', 'Process 4, Element 2',
'Process 4, Element 3', 'Process 4, Element 4', 'Process 3, Element 0', 
'Process 3, Element 1', 'Process 3, Element 2', 'Process 3, Element 3', 
'Process 3, Element 4']

Outside of IF
['Process 2, Element 0','Process 2, Element 1', 'Process 2, Element 2', 
'Process 2, Element 3', 'Process 2, Element 4', 'Process 1, Element 0', 
'Process 1, Element 1', 'Process 1, Element 2', 'Process 1, Element 3', 
'Process 1, Element 4', 'Process 0, Element 0', 'Process 0, Element 1', 
'Process 0, Element 2', 'Process 0, Element 3', 'Process 0, Element 4', 
'Process 4, Element 0', 'Process 4, Element 1', 'Process 4, Element 2', 
'Process 4, Element 3', 'Process 4, Element 4', 'Process 3, Element 0', 
'Process 3, Element 1', 'Process 3, Element 2', 'Process 3, Element 3', 
'Process 3, Element 4']
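
An alternative arrangement, sketched here under the assumption that nothing needs to run at module level, keeps all the work inside a function and returns a plain copy of the list. The function name `run` is made up for this sketch. The module-level L = [] is then unnecessary, and no print is repeated in the spawned processes.

```python
from multiprocessing import Process, Manager


def dothing(L, i):
    # Each process appends five labelled elements to the managed list.
    for j in range(5):
        L.append("Process %d, Element %d" % (i, j))


def run(n_processes=5):
    with Manager() as manager:
        shared = manager.list()
        processes = [
            Process(target=dothing, args=(shared, i)) for i in range(n_processes)
        ]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        # Take a plain copy before the manager shuts down at the end of the block.
        return list(shared)


if __name__ == "__main__":
    L = run()
    print(len(L))
    print(L)
```

The order of elements from different processes can still vary between runs; only the elements of a single process keep their relative order.
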
sebtac
2

Thanks to @falsetru for pointing to the exact documentation and providing good code. I need to keep the order for my application. The code below, modified from @falsetru's, preserves the order: instead of appending, each process writes to its own index in a list that is created with its final length up front.

The sleep makes ordering bugs easier to catch; without it, the problem with the ordering of the list is hard to reproduce.

from multiprocessing import Process, Manager
from time import sleep

def dothing(L, i):  # the managed list `L` passed explicitly.
    L[i] = i  # write to this process's own slot instead of appending
    sleep(4)

if __name__ == "__main__":
    with Manager() as manager:
        L = manager.list(range(50))  # <-- can be shared between processes.
        processes = []
        for i in range(50):
            p = Process(target=dothing, args=(L,i))  # Passing the list
            p.start()
            processes.append(p)
        for p in processes:
            p.join()
        print(L)
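
If the final length is not known in advance, another option is to keep appending but tag each result with its index and sort afterwards. This is only a sketch of that idea: the function name `run_in_order` and the squared value are placeholders invented for the example.

```python
from multiprocessing import Process, Manager


def dothing(L, i):
    # Tag the result with its index so the order can be restored later.
    L.append((i, i * i))


def run_in_order(n):
    with Manager() as manager:
        shared = manager.list()
        processes = [Process(target=dothing, args=(shared, i)) for i in range(n)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        pairs = list(shared)  # plain copy, taken while the manager is alive
    pairs.sort()  # tuples sort by their first element, the index
    return [value for _, value in pairs]


if __name__ == "__main__":
    print(run_in_order(5))
```

The append order inside the managed list may differ from run to run, but the sorted result does not.
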
eSadr