
I am using Python's multiprocessing to create a parallel application. Processes need to share some data, for which I use a Manager. However, I have some common functions which processes need to call and which need to access the data stored by the Manager object. My question is whether I can avoid needing to pass the Manager instance to these common functions as an argument and rather use it like a global. In other words, consider the following code:

import multiprocessing as mp

manager = mp.Manager()
global_dict = manager.dict(a=[0])

def add():
    global_dict['a'] += [global_dict['a'][-1]+1]

def foo_parallel(var):
    add()
    print(var)

num_processes = 5
p = []
for i in range(num_processes):
    p.append(mp.Process(target=foo_parallel,args=(global_dict,)))

[pi.start() for pi in p]
[pi.join() for pi in p]

This runs fine and returns p=[0,1,2,3,4,5] on my machine. However, is this "good form"? Is this a good way of doing it, just as good as defining add(var) and calling add(var) instead?

space_voyager

1 Answer


Your code example has bigger problems than form. You get your desired output only by luck; repeated execution will yield different results. That's because += is not an atomic operation: it reads the old value, builds the new one and writes it back as separate steps. Multiple processes can read the same old value one after another before any of them has written its update, and they will then all write back the same value. To prevent this, you have to guard the read and the update with a lock, for example one created with manager.Lock().
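To make the separate steps visible, here is a minimal single-process sketch. `TracingDict` is a hypothetical helper (a plain `dict` subclass, not part of `multiprocessing`) that logs every item read and write. It only illustrates how Python evaluates the statement; with a managed dict each logged access would be a separate request to the manager process, and another process could run between them.

```python
class TracingDict(dict):
    """Hypothetical helper: a dict that records item reads and writes."""

    def __init__(self, *args, **kwargs):
        self.calls = []
        super().__init__(*args, **kwargs)

    def __getitem__(self, key):
        self.calls.append(('get', key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append(('set', key))
        super().__setitem__(key, value)


d = TracingDict(a=[0])
d.calls.clear()  # start from an empty log

# The statement from the question, applied to the tracing dict.
d['a'] += [d['a'][-1] + 1]

# One read for the augmented-assignment target, one read on the right-hand
# side, and only then the write-back.
print(d.calls)
```

The log shows two reads followed by one write. Nothing ties those three accesses together, which is why a lock around the whole statement is needed once several processes are involved.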


As for your original question about "good form":

IMO it would be cleaner to let foo_parallel, the main function of the child process, pass global_dict explicitly into a generic function add(var). That would be a form of dependency injection, and it has some advantages. A non-exhaustive list for your example:

  • allows isolated testing

  • increases code reusability

  • easier debugging (detecting that the managed object is not accessible shouldn't be delayed until add is called; fail fast)

  • less boilerplate code (for example try-except blocks around resources that multiple functions need)
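The first point in the list above can be sketched as follows. This is only an illustration under assumptions: the parameter name `shared` and this particular body of `add` are hypothetical. Because the function receives the mapping as an argument, it can be exercised with an ordinary `dict`, without starting a Manager or any child process.

```python
def add(shared):
    # Read the current list, then assign a new list back to the key.
    # Assigning (rather than mutating in place) also suits a managed dict,
    # where a change made to a retrieved list has to be written back.
    values = shared['a']
    shared['a'] = values + [values[-1] + 1]


# Isolated test with a plain dict standing in for the managed dict.
fake = {'a': [0]}
add(fake)
add(fake)
print(fake)
```

In the real application the same function would be called with the managed dict, inside the lock.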

As a side note: using a list comprehension only for its side effects is considered a 'code smell'. If you don't need a list as a result, just use a for-loop.

Code:

import os
from multiprocessing import Process, Manager


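def add(l):
    # Append the successor of the last element and return the list.
    l += [l[-1] + 1]
    return l


def foo_parallel(global_dict, lock):
    # Hold the lock across the read and the write-back so that no other
    # process can interleave between the two.
    with lock:
        l = global_dict['a']  # the proxy returns a copy of the stored list
        global_dict['a'] = add(l)  # write the updated copy back
        print(os.getpid(), global_dict)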


if __name__ == '__main__':

    N_WORKERS = 5

    with Manager() as manager:

        lock = manager.Lock()
        global_dict = manager.dict(a=[0])

        pool = [Process(target=foo_parallel, args=(global_dict, lock))
                for _ in range(N_WORKERS)]

        for p in pool:
            p.start()

        for p in pool:
            p.join()

        print('result', global_dict)
Darkonaut
  • Dear Darkonaut, regarding your comment about "the atomic operation (+= operation)", shouldn't your example also apply a solution for it? For instance, in your "add" function, I believe that the variable "l" could be updated as follows: "l.append([l[-1] + 1])". Am I wrong in assuming that? Sincerely, – Philipe Riskalla Leal Jun 24 '22 at 01:05
  • As it seems, one cannot do what I suggested above. I have just tested it, and a "TypeError: can only concatenate list (not "int") to list" appeared. Can anyone explain why? A solution to this error seems to be "l = l + [l[-1] + 1]", though. – Philipe Riskalla Leal Jun 24 '22 at 01:09
  • Dear @PhilipeRiskallaLeal, you get this error because, while you are already appending, you also add another nested list when you write `l.append([l[-1] + 1])`. It should be `l.append(l[-1] + 1)` instead, and this would indeed be nicer than using `+=`, but it still is not atomic, since it first reads `l[-1]` and only then appends. My solution above resolves the _inter_-process race condition by using a Lock for read & update, rendering the whole operation atomic on an inter-process level. – Darkonaut Jun 24 '22 at 03:04
  • I see now my mistake. Thank you Darkonaut. – Philipe Riskalla Leal Jun 24 '22 at 13:07
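
The nesting error from the comments above can be reproduced with a plain list and no processes at all. A minimal sketch; the function names `add_nested` and `add_flat` are hypothetical and only serve to put the two variants side by side:

```python
def add_nested(l):
    # Variant from the comment: appends a one-element list, not a number.
    l.append([l[-1] + 1])
    return l


def add_flat(l):
    # Corrected variant: appends the number itself.
    l.append(l[-1] + 1)
    return l


nested = add_nested([0])
print(nested)  # the last element is now a list

# The next call evaluates l[-1] + 1 with l[-1] being a list, which fails
# before anything is appended.
try:
    add_nested(nested)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

flat = add_flat(add_flat([0]))
print(flat)
```

The first call succeeds, so the error only shows up on the second update, for example in the second worker process to acquire the lock.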