
I don't understand why the following code (you can run it here):

from multiprocessing.pool import Pool
from time import sleep

MY_LIST = []


def worker(j):
    global MY_LIST
    sleep(1)
    MY_LIST.append(j)
    print(j, len(MY_LIST))


if __name__ == '__main__':
    parameters= [1, 2, 3, 4]
    with Pool(processes=2) as pool:
        results = pool.map(worker, parameters)

prints:

1 1
2 1
3 2
4 2

And if I comment out the sleep(1) part, it prints what I expect:

1 1
2 2
3 3
4 4

How can I fix the code so that it appends correctly and I end up with 4 elements in MY_LIST?

Caner
  • Curiously, this code won't execute for me. I get `AttributeError: Can't get attribute 'worker' on ` – Sam Morgan Mar 30 '20 at 00:24
  • @SamMorgan try this: https://repl.it/@CP8/PhonyGlossyKnowledge – Caner Mar 30 '20 at 00:26
  • What is your goal? You probably need to use some synchronization mechanism when it comes to have several workers in parallel. – dcg Mar 30 '20 at 01:10
  • Looks like a timing issue. `pool.map` splits the iterable into chunks and submits these as individual tasks to the workers. My guess is that in the version without the `sleep` one of the workers quickly grabs all four tasks and executes them whereas the version with the `sleep` allows for both workers to pick up two of the tasks each. You can `print(multiprocessing.current_process().name, j, len(MY_LIST))` in the `worker` function to see if that is the case. – shmee Mar 30 '20 at 02:59

2 Answers


With multiprocessing, each worker is a separate process with its own copy of MY_LIST, so a plain global is not shared between them. In your code, without the sleep, the tasks finish so quickly that a single worker ends up handling all of them, so every append goes to the same copy of MY_LIST. With the sleep, both workers pick up tasks, and each one appends to its own instance of MY_LIST.
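
To make the separate copies visible, here is a small sketch, assuming the same two-worker pool as in the question. Each task returns the ID of the process that ran it together with the length of that process's copy of the list. The function name run_demo and the 0.2-second sleep are illustrative choices, not part of the question.

```python
import os
from multiprocessing import Pool
from time import sleep

MY_LIST = []


def worker(j):
    # Appends to the copy of MY_LIST that lives in *this* worker process.
    sleep(0.2)
    MY_LIST.append(j)
    return os.getpid(), j, len(MY_LIST)


def run_demo():
    # Hypothetical helper: run four tasks on two workers and report back.
    with Pool(processes=2) as pool:
        rows = pool.map(worker, [1, 2, 3, 4])
    # The parent's MY_LIST is never touched by the workers.
    return rows, list(MY_LIST)


if __name__ == '__main__':
    rows, parent_list = run_demo()
    for pid, j, length in rows:
        print(pid, j, length)
    print('parent sees:', parent_list)
```

Which process ID handles which task can differ between runs, but the parent's list stays empty either way, because every append happened in a worker's own copy.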

To see the same data from every process, you need a shared variable. One option is multiprocessing.Array, a shared array whose size has to be allocated up front:

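from multiprocessing import Array, Pool, current_process

# Shared, fixed-size array of C ints. It is created at import time, so the
# workers only see the same array when they are forked from this process.
MY_LIST = Array('i', 8)

def worker(j):
    # Each task writes to its own slot, so no two tasks touch the same index.
    MY_LIST[j-1] = j*j
    print(current_process().name, j, len(MY_LIST))

if __name__ == '__main__':
    parameters = [1, 2, 3, 4, 5, 6, 7, 8]
    with Pool(processes=4) as pool:
        results = pool.map(worker, parameters)
    for i in MY_LIST:
        print(i)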

Output:

ForkPoolWorker-1 1 8
ForkPoolWorker-2 2 8
ForkPoolWorker-1 4 8
ForkPoolWorker-3 3 8
ForkPoolWorker-1 5 8
ForkPoolWorker-2 6 8
ForkPoolWorker-1 7 8
ForkPoolWorker-3 8 8
1
4
9
16
25
36
49
64
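
The code above creates the Array at import time, which fits the forked workers shown in the output (ForkPoolWorker-N). Where workers are started with spawn instead, each worker re-imports the module and, as far as I can tell, would build its own array. One possible way around that, sketched here as an assumption rather than as part of the original answer, is to create the array in the parent and hand it to every worker through the Pool initializer. The names init_worker, shared, and run are hypothetical.

```python
from multiprocessing import Array, Pool

shared = None  # set in each worker process by init_worker


def init_worker(arr):
    # Runs once in every worker; stores the shared array in a module global.
    global shared
    shared = arr


def worker(j):
    # Each task writes only to its own slot of the shared array.
    shared[j - 1] = j * j


def run(n=8, processes=4):
    # Hypothetical helper: allocate the array in the parent, fill it in workers.
    arr = Array('i', n)
    with Pool(processes=processes,
              initializer=init_worker, initargs=(arr,)) as pool:
        pool.map(worker, range(1, n + 1))
    return arr[:]  # copy the shared values into an ordinary list


if __name__ == '__main__':
    print(run())
```

The array is passed via initargs rather than as a task argument because shared ctypes objects are meant to be handed over when the worker process is created.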
imran

The call to time.sleep exposes a timing effect: without it, one of the workers finishes all of the tasks before the other can even start, so every append lands in that one process's copy of MY_LIST. This becomes clear if you use a longer list, for example:

from multiprocessing.pool import Pool
from time import sleep

MY_LIST = []

def worker(j):
    global MY_LIST
    sleep(1)
    MY_LIST.append(j)
    print(j, len(MY_LIST))

if __name__ == '__main__':
    parameters = range(25)
    with Pool(processes=2) as pool:
        results = pool.map(worker, parameters)

Will output something like

1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
1 10
1 11
1 12
1 13
1 14
1 15
1 16
1 17
1 18
1 19
1 20
1 21
1 1
1 2
1 3
1 4

That is not the only problem, though. Global variables are not shared between processes: each worker gets its own copy of an ordinary Python object such as a list. You need a proxy object such as the one returned by multiprocessing.Manager().list(), something like this:

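from multiprocessing import Pool, Manager

# The manager runs a server process that owns the real list; MY_LIST is a proxy.
# Creating it at import time assumes the workers are forked from this process.
manager = Manager()
MY_LIST = manager.list()

def worker(j):
    # append() and len() are two separate calls to the manager, so another
    # worker may append in between them.
    MY_LIST.append(j)
    print(j, len(MY_LIST))


if __name__ == '__main__':
    parameters = range(25)
    with Pool(2) as pool:
        results = pool.map(worker, parameters)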

Which will output something like this (the exact order varies from run to run):

4 1
5 2
6 3
7 4
8 5
9 6
10 7
11 8
12 10
13 11
0 10
1 13
14 12
2 15
15 15
3 16
16 18
17 19
18 20
20 20
19 22
24 23
21 23
22 24
23 25

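Note that this only guarantees shared access, not order: items are appended in whatever order the workers reach them, and the printed length may already include appends made by the other worker.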
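
If the goal is just to end up with every result in one list in the parent, and order matters, a simpler route is to skip shared state and have the workers return their values. This is a different technique from the Manager list above: pool.map collects the return values and hands them back in the order of the inputs. A minimal sketch, where collect is a hypothetical helper name and the 0.1-second sleep only stands in for real work:

```python
from multiprocessing import Pool
from time import sleep


def worker(j):
    # Do the work and return the result instead of appending to a global.
    sleep(0.1)
    return j


def collect(parameters, processes=2):
    # pool.map returns the results in the same order as the inputs.
    with Pool(processes=processes) as pool:
        return pool.map(worker, parameters)


if __name__ == '__main__':
    MY_LIST = collect([1, 2, 3, 4])
    print(MY_LIST, len(MY_LIST))  # prints: [1, 2, 3, 4] 4
```

Because the list is built in the parent, no locking or proxy object is needed, and the result does not depend on which worker ran which task.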

William Miller