
In the post How to get the return value of a function passed to multiprocessing.Process? there were many solutions for getting a value out of multiprocessing. vartec and Nico Schlömer also mentioned sharing state between processes:

from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print(num.value)
    print(arr[:])

However, the objects that can be stored in Value and Array seem to be limited to simple types, not arbitrary Python objects. They also mentioned the Manager() class, but I'm not sure how they started the manager, since

return_dict = manager.dict() # manager was never initialized anywhere
return_dict.start() 

In practice, the desired process runs like this:

def function(Input):
    Output = computation(Input)
    return Output


p1 = multiprocessing.Process(target=function, args=(input_1,))
p2 = multiprocessing.Process(target=function, args=(input_2,))

p1.start()
p2.start()

p1.join()
p2.join()

or inside a while loop. The returned objects

output_1,output_2

may be complicated objects from other packages, such as sympy or numpy objects. The main program should just get the raw returned objects in a list, in the order the processes were started:

[output_1,output_2]

or tagged with a simple label:

def function(Input, ix):
    Output = computation(Input)
    return [ix, Output]

p1 = multiprocessing.Process(target=function, args=(input_1, 1))
p2 = multiprocessing.Process(target=function, args=(input_2, 2))

[[2,output_2],[1,output_1]]

I thought of defining a list globally and just appending each return value to it. However, I worried that if p1 and p2 finished at the same time, they would try to append to the list simultaneously and corrupt memory (can that happen?), or slow the program down. I also saw answers using Queue(), but that approach changes the function itself quite a lot, so function(Input) can no longer be called normally.
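The Queue()-based answers looked roughly like the sketch below: instead of editing function() itself, a small wrapper pushes a labeled result into the queue (the wrapper name and the doubling stand-in for computation() are illustrative):

```python
import multiprocessing

def function(Input):
    # stands in for the unchanged worker; the real computation() goes here
    return Input * 2

def wrapper(q, ix, Input):
    # shim: call function() normally, push a (label, result) pair
    q.put((ix, function(Input)))

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=wrapper, args=(q, 1, 10))
    p2 = multiprocessing.Process(target=wrapper, args=(q, 2, 20))
    p1.start()
    p2.start()
    # drain the queue before join() so a full pipe cannot deadlock
    results = [q.get() for _ in range(2)]
    p1.join()
    p2.join()
    results.sort()  # restore start order via the labels
    print(results)  # [(1, 20), (2, 40)]
```

So function(Input) itself stays callable as usual; only the target passed to Process changes.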

I saw an example with Pool:

from multiprocessing import Pool

def job(num):
    return num * 2

if __name__ == '__main__':
    p = Pool(processes=20)
    data = p.map(job, [i for i in range(20)])
    p.close()
    print(data)

which is ridiculously simple compared to the Process method. Does that mean Pool is superior? However, in this case the script is intended to use Process instead of Pool.

Is there a way to just run function() over a range of inputs and then get the return values, without changing how function() is coded (i.e. function(1) = 3.14159265...), using the Process class? What is the simplest way to get the return value of a function passed to multiprocessing.Process without using too many other multiprocessing objects?

  • "However, in this case the script intended to use Process instead of pool." I never understand this type of statement. It is as if you have decided in advance what tool to use, before you know what tools are available and what they do. The Process Pool is *exactly* what you are asking for, since it lets you launch any function in another Process and capture the result. I agree that it is much simpler than any other approach. You miss out on the joy of typing `import Process` and calling the Process constructor, but other than that I don't see why you shouldn't simply use a Pool. – Paul Cornelius Mar 28 '23 at 04:22

1 Answer


Multiprocessing, and especially passing data between processes, can be tricky because each process has its own memory. Therefore, processes need special objects for passing data between them, such as multiprocessing.Queue or the list from a multiprocessing.Manager, which handle the IPC in the background. Be aware that "normal" objects like regular queues and lists generally don't work for this kind of application. In general, it is a good idea to take a look at the official documentation, where you can also find how to initialize a Manager:

from multiprocessing import Manager
manager = Manager()
my_multiprocessing_list = manager.list()
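To connect this to the labeled-output example from the question, here is a sketch of collecting (label, result) pairs in a manager list; the worker f and the doubling computation are illustrative:

```python
from multiprocessing import Manager, Process

def f(results, ix, x):
    # the list proxy routes appends through the manager process,
    # so concurrent appends from p1 and p2 are serialized safely
    results.append((ix, x * 2))

if __name__ == "__main__":
    manager = Manager()
    results = manager.list()
    p1 = Process(target=f, args=(results, 1, 10))
    p2 = Process(target=f, args=(results, 2, 20))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(sorted(results))  # [(1, 20), (2, 40)]
```

Sorting by the label at the end restores the order in which the processes were started, regardless of which one appended first.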

If you like the style of a pool and want the return values, I would suggest you take a look at ProcessPoolExecutor. When queueing new tasks with pool.submit(), you get a Future object back, which can be used to fetch the result:

from concurrent.futures import ProcessPoolExecutor


def job(num):
    return num * 2


if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        futures = []
        for i in range(20):
            futures.append(pool.submit(job, i))

        for f in futures:
            print(f.result())
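If you only need the results in submission order, the executor's map() does that bookkeeping for you; a minimal variant of the example above:

```python
from concurrent.futures import ProcessPoolExecutor

def job(num):
    return num * 2

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        # map() yields results in input order, even if
        # workers finish out of order
        data = list(pool.map(job, range(20)))
    print(data[:5])  # [0, 2, 4, 6, 8]
```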
Baumkuchen