
So I'm trying to implement multiprocessing in Python, where I wish to have a Pool of 4-5 processes running a method in parallel. The purpose is to run a total of a thousand Monte Carlo simulations (200-250 simulations per process) instead of running all 1000 in a single process. I want each process to write to a common shared array by acquiring a lock on it as soon as it's done processing the result for one simulation, writing the result, and releasing the lock. So it should be a three-step process:

  1. Acquire lock
  2. Write result
  3. Release lock for other processes waiting to write to array.

Every time I pass the array to the processes, each process creates its own copy of that array, which I do not want because I want a common array. Can anyone help me with this by providing sample code?
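
For example, here's a minimal sketch of the behaviour I'm describing (the names are just for illustration): a plain Python list passed to a Process becomes a per-child copy, so mutations made in the child never reach the parent.

from multiprocessing import Process

def negate_first(results):
    # This mutates the child's own copy of the list; the parent never sees it.
    results[0] = -results[0]

if __name__ == "__main__":
    results = [3, -7]
    p = Process(target=negate_first, args=(results,))
    p.start()
    p.join()
    print(results)  # still [3, -7] -- the child worked on a copy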

Delgan
Vedant7
  • You have tons of sample code if you search some answers. So, if you don't provide some of yours, we won't be able to help you. Btw, use [semaphores](https://docs.python.org/2/library/threading.html#semaphore-objects) to lock the threads – Raskayu Aug 24 '16 at 11:38
  • What's unclear about the [examples](https://docs.python.org/3/library/multiprocessing.html#sharing-state-between-processes) in the official documentation? – mata Aug 24 '16 at 11:45
  • Do you need to access the array whilst the simulations are ongoing? If not, you can just use the set of `Pool.map` functions. – Dunes Aug 24 '16 at 12:08

2 Answers


Since you're only returning state from the child processes to the parent process, using a shared array and explicit locks is overkill. You can use Pool.map or Pool.starmap to accomplish exactly what you need. For example:

from multiprocessing import Pool

class Adder:
    """I'm using this class in place of a monte carlo simulator"""

    def add(self, a, b):
        return a + b

def setup(x, y, z):
    """Sets up the worker processes of the pool. 
    Here, x, y, and z would be your global settings. They are only included
    as an example of how to pass args to setup. In this program they would
    be "some arg", "another" and 2
    """
    global adder
    adder = Adder()

def job(a, b):
    """wrapper function to start the job in the child process"""
    return adder.add(a, b)

if __name__ == "__main__":   
    args = list(zip(range(10), range(10, 20)))
    # args == [(0, 10), (1, 11), ..., (8, 18), (9, 19)]

    with Pool(initializer=setup, initargs=["some arg", "another", 2]) as pool:
        # runs jobs in parallel and returns when all are complete
        results = pool.starmap(job, args)

    print(results) # prints [10, 12, ..., 26, 28] 
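
As a hedged sketch of how this maps onto the question's Monte Carlo case (run_simulation is a hypothetical stand-in for the real simulation), the parent collects all 1000 results itself, so no shared array or lock is needed:

from multiprocessing import Pool
import random

def run_simulation(seed):
    """Hypothetical stand-in for one Monte Carlo simulation."""
    rng = random.Random(seed)  # per-task RNG so each simulation is reproducible
    return sum(rng.random() for _ in range(1000))

if __name__ == "__main__":
    with Pool(processes=4) as pool:  # 4-5 workers, as in the question
        # one task per simulation; results come back in submission order
        results = pool.map(run_simulation, range(1000))
    print(len(results))  # 1000 results collected in the parent, no locking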
Dunes

Not tested, but something like this should work. The array and the lock are shared between the processes.

from multiprocessing import Process, Array, Lock

def f(array, lock, n):  # n is the dedicated location in the array
    lock.acquire()
    array[n] = -array[n]
    lock.release()

if __name__ == '__main__':
    arr = Array('i', [3, -7])  # shared-memory array of C ints
    lock = Lock()
    p = Process(target=f, args=(arr, lock, 0))
    q = Process(target=f, args=(arr, lock, 1))
    p.start()
    q.start()
    q.join()
    p.join()

    print(arr[:])  # prints [-3, 7]

The documentation at https://docs.python.org/3.5/library/multiprocessing.html has plenty of examples to start with.

Rémi Baudoux
Julien
  • That is _not_ a shared array. Changing it in a subprocess won't have any effect in the parent. – mata Aug 24 '16 at 11:48
  • 5
    Also, an [Array](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Array) is created by default with a lock attached which you can get using its `get_lock()` method, no need to allocate it explicitly unless you want to use a different lock type. The body could then become just `with array.get_lock(): ...` – mata Aug 24 '16 at 12:02
  • @mata I need some clarification. Do you mean that I can change the function f to just "def f(array): with array.get_lock(): #modify array here# " and there is no need to create a lock object and explicitly pass it? – RodrikTheReader Oct 26 '17 at 06:45
  • @RodrikTheReader yes this is what mata explains: https://docs.python.org/2/library/multiprocessing.html#shared-ctypes-objects – Julien Oct 26 '17 at 06:55
  • @Julien Thanks. I am actually struggling with a similar shared memory problem as posted in the question by OP. I need 4 separate arrays that can be shared by multiple processes. I've defined the shared array in a class and I'm passing the objects of that class to a function which writes on the shared arrays. I'm spawning multiple processes and the target is this function. Anyway, the program is not working as I expect. Do you think instead of creating objects, I should just create the shared arrays in the main program and pass them to the function? – RodrikTheReader Oct 26 '17 at 07:08
  • @RodrikTheReader sorry but I do not have enough information. You may continue to search a bit by yourself and then ask a new question. – Julien Oct 26 '17 at 07:18
  • @Julien No problem – RodrikTheReader Oct 26 '17 at 07:20
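
For completeness, here is a minimal sketch of the variant mata describes in the comments above, assuming the same two-element array as the answer; the Array's built-in lock replaces the explicit Lock object:

from multiprocessing import Process, Array

def f(array, n):
    # An Array is created with its own (recursive) lock by default;
    # get_lock() returns it, so no separate Lock object is needed.
    with array.get_lock():
        array[n] = -array[n]

if __name__ == '__main__':
    arr = Array('i', [3, -7])
    p = Process(target=f, args=(arr, 0))
    q = Process(target=f, args=(arr, 1))
    p.start()
    q.start()
    p.join()
    q.join()
    print(arr[:])  # [-3, 7]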