
So I'm trying to implement multiprocessing in Python, where I wish to have a Pool of 4-5 processes running a method in parallel. The purpose is to run a total of a thousand Monte Carlo simulations (200-250 simulations per process) instead of running all 1000 in a single process. I want each process to write to a common shared array by acquiring a lock on it as soon as it's done processing the result for one simulation, writing the result, and releasing the lock. So it should be a three-step process:

  1. Acquire lock
  2. Write result
  3. Release lock for other processes waiting to write to array.

Every time I pass the array to the processes, each process creates its own copy of that array, which I do not want because I want a common array. Can anyone help me with this by providing sample code?
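
For example, here's a minimal sketch of the behaviour I'm describing (the names are just for illustration): a plain Python list passed to a Process becomes a per-child copy, so mutations made in the child never reach the parent.

from multiprocessing import Process

def negate_first(results):
    # This mutates the child's own copy of the list; the parent never sees it.
    results[0] = -results[0]

if __name__ == "__main__":
    results = [3, -7]
    p = Process(target=negate_first, args=(results,))
    p.start()
    p.join()
    print(results)  # still [3, -7] -- the child worked on a copy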

Delgan
Vedant7
  • You have tons of sample code if you search some answers. So, if you don't provide some of yours, we won't be able to help you. Btw, use [semaphores](https://docs.python.org/2/library/threading.html#semaphore-objects) to lock the threads – Raskayu Aug 24 '16 at 11:38
  • What's unclear about the [examples](https://docs.python.org/3/library/multiprocessing.html#sharing-state-between-processes) in the official documentation? – mata Aug 24 '16 at 11:45
  • Do you need to access the array whilst the simulations are ongoing? If not, you can just use the set of `Pool.map` functions. – Dunes Aug 24 '16 at 12:08

2 Answers


Since you're only returning state from the child processes to the parent process, using a shared array and explicit locks is overkill. You can use Pool.map or Pool.starmap to accomplish exactly what you need. For example:

from multiprocessing import Pool

class Adder:
    """I'm using this class in place of a monte carlo simulator"""

    def add(self, a, b):
        return a + b

def setup(x, y, z):
    """Sets up the worker processes of the pool. 
    Here, x, y, and z would be your global settings. They are only included
    as an example of how to pass args to setup. In this program they would
    be "some arg", "another" and 2
    """
    global adder
    adder = Adder()

def job(a, b):
    """wrapper function to start the job in the child process"""
    return adder.add(a, b)

if __name__ == "__main__":   
    args = list(zip(range(10), range(10, 20)))
    # args == [(0, 10), (1, 11), ..., (8, 18), (9, 19)]

    with Pool(initializer=setup, initargs=["some arg", "another", 2]) as pool:
        # runs jobs in parallel and returns when all are complete
        results = pool.starmap(job, args)

    print(results) # prints [10, 12, ..., 26, 28] 
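
As a hedged sketch of how this maps onto the question's Monte Carlo case (run_simulation is a hypothetical stand-in for the real simulation), the parent collects all 1000 results itself, so no shared array or lock is needed:

from multiprocessing import Pool
import random

def run_simulation(seed):
    """Hypothetical stand-in for one Monte Carlo simulation."""
    rng = random.Random(seed)  # per-task RNG so each simulation is reproducible
    return sum(rng.random() for _ in range(1000))

if __name__ == "__main__":
    with Pool(processes=4) as pool:  # 4-5 workers, as in the question
        # one task per simulation; results come back in submission order
        results = pool.map(run_simulation, range(1000))
    print(len(results))  # 1000 results collected in the parent, no locking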
Dunes

Not tested, but something like this should work. The array and the lock are shared between the processes.

from multiprocessing import Process, Array, Lock

def f(array, lock, n):  # n is the dedicated location in the array
    lock.acquire()
    array[n] = -array[n]
    lock.release()

if __name__ == '__main__':
    arr = Array('i', [3, -7])  # shared-memory array of C ints
    lock = Lock()
    p = Process(target=f, args=(arr, lock, 0))
    q = Process(target=f, args=(arr, lock, 1))
    p.start()
    q.start()
    q.join()
    p.join()

    print(arr[:])  # prints [-3, 7]

The documentation at https://docs.python.org/3.5/library/multiprocessing.html has plenty of examples to start with.

Rémi Baudoux
Julien
  • That is _not_ a shared array. Changing it in a subprocess won't have any effect in the parent. – mata Aug 24 '16 at 11:48
  • 5
    Also, an [Array](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Array) is created by default with a lock attached which you can get using its `get_lock()` method, no need to allocate it explicitly unless you want to use a different lock type. The body could then become just `with array.get_lock(): ...` – mata Aug 24 '16 at 12:02
  • @mata I need some clarification. Do you mean that I can change the function f to just "def f(array): with array.get_lock(): #modify array here# " and there is no need to create a lock object and explicitly pass it? – RodrikTheReader Oct 26 '17 at 06:45
  • @RodrikTheReader yes this is what mata explains: https://docs.python.org/2/library/multiprocessing.html#shared-ctypes-objects – Julien Oct 26 '17 at 06:55
  • @Julien Thanks. I am actually struggling with a similar shared memory problem as posted in the question by OP. I need 4 separate arrays that can be shared by multiple processes. I've defined the shared array in a class and I'm passing the objects of that class to a function which writes on the shared arrays. I'm spawning multiple processes and the target is this function. Anyway, the program is not working as I expect. Do you think instead of creating objects, I should just create the shared arrays in the main program and pass them to the function? – RodrikTheReader Oct 26 '17 at 07:08
  • @RodrikTheReader sorry but I do not have enough information. You may continue to search a bit by yourself and then ask a new question. – Julien Oct 26 '17 at 07:18
  • @Julien No problem – RodrikTheReader Oct 26 '17 at 07:20
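
For completeness, here is a minimal sketch of the variant mata describes in the comments above, assuming the same two-element array as the answer; the Array's built-in lock replaces the explicit Lock object:

from multiprocessing import Process, Array

def f(array, n):
    # An Array is created with its own (recursive) lock by default;
    # get_lock() returns it, so no separate Lock object is needed.
    with array.get_lock():
        array[n] = -array[n]

if __name__ == '__main__':
    arr = Array('i', [3, -7])
    p = Process(target=f, args=(arr, 0))
    q = Process(target=f, args=(arr, 1))
    p.start()
    q.start()
    p.join()
    q.join()
    print(arr[:])  # [-3, 7]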