
I recently started using multiprocessing in Python. The following code is meant to append items to a list from multiple processes, but the list ends up empty.

from multiprocessing import Pool
import time

global_list = list()


def testfun(n):
    print('started ', n)
    time.sleep(1)
    global_list.append(n)
    print('completed ', n)


def call_multiprocessing_function():
    mytasks = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n']
    with Pool() as pool:
        pool.map(testfun, mytasks)


if __name__ == "__main__":
    print('starting the script')

    print(global_list)
    call_multiprocessing_function()
    print(global_list)

    print('completed the script')

I am getting the following output:

starting the script
[]
started  a
started  b
started  c
started  d
completed  a
started  e
completed  b
started  f
completed  c
started  g
completed  d
started  h
completed  e
started  i
completed  f
started  j
completed  g
started  k
completed  h
started  l
completed  i
started  m
completed  j
started  n
completed  k
completed  l
completed  m
completed  n
[]
completed the script

The result list comes back empty. Is there a way to share a common variable across all these processes to store the data? How can this be achieved with multiprocessing?

vks
newbie

1 Answer


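Processes do not share memory, so each worker appends to its own copy of global_list and the parent's list is never modified. Use Manager.list instead, which returns a proxy to a list held in a separate manager process.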

import time
from multiprocessing import Pool, Manager

m = Manager()
global_list = m.list()


def testfun(n):
    print('started ', n)
    time.sleep(1)
    global_list.append(n)
    print('completed ', n)


def call_multiprocessing_function():
    mytasks = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n']
    with Pool() as p:
        p.map(testfun, mytasks)


if __name__ == "__main__":
    print('starting the script')

    print(global_list)
    call_multiprocessing_function()
    print(global_list)

    print('completed the script')

Output:

starting the script
[]
started  a
started  b
started  c
started  d
started  e
started  f
started  g
started  h
completed  e
started  i
completed  f
started  j
completed  d
started  k
completed  a
started  l
completed  g
started  m
completed  b
completed  c
started  n
completed  h
completed  i
completed  j
completed  k
completed  l
completed  n
completed  m
['e', 'f', 'd', 'a', 'g', 'b', 'c', 'h', 'i', 'j', 'k', 'l', 'n', 'm']
completed the script
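
If the workers only need to hand results back to the parent, a shared list may not be needed at all: Pool.map collects each worker's return value and returns them in the order of the input. A minimal sketch, assuming the per-task work can be expressed as a return value (process_task and run_tasks are hypothetical names, not part of the code above):

```python
from multiprocessing import Pool


def process_task(n):
    # Hypothetical stand-in for the real work: return the result
    # instead of appending it to a global list.
    return n.upper()


def run_tasks(tasks):
    # Pool.map sends each return value back to the parent process and
    # keeps the results in the same order as the input list.
    with Pool() as pool:
        return pool.map(process_task, tasks)


if __name__ == "__main__":
    results = run_tasks(['a', 'b', 'c', 'd'])
    print(results)
```

Unlike the Manager.list version, the result order here follows the input order rather than the order in which the workers finish.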
cs95
vks
  • Use Python 3, that's what the OP is using. – cs95 Mar 15 '18 at 05:51
  • No, they were statements and were printing out tuples. I fixed the output for you by running your code on my machine. No other difference to report here. This will work on Python 2 out of the box. – cs95 Mar 15 '18 at 05:55
  • Oh, that is awesome! But I'm getting the error below. RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable. – newbie Mar 15 '18 at 06:00
  • I think this error is specific to the Windows platform; I am using Windows 10 (source: https://stackoverflow.com/questions/18204782/runtimeerror-on-windows-trying-python-multiprocessing) – newbie Mar 15 '18 at 06:11