
Thanks to "How to run functions in parallel?", the following code works.

import time
from multiprocessing import Process


def worker():
    time.sleep(2)
    print("Working")


def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = Process(target=fn)
        p.start()
        proc.append(p)
    for p in proc:
        p.join()


if __name__ == '__main__':
    start = time.time()
    runInParallel(worker, worker, worker, worker)
    print("Total time taken: ", time.time()-start)

However, if I add an argument to worker(), it no longer runs in parallel.

import time
from multiprocessing import Process

def worker(ii):
    time.sleep(ii)
    print("Working")

def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = Process(target=fn)
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    start = time.time()
    runInParallel(worker(2), worker(2), worker(2), worker(2))
    print("Total time taken: ", time.time()-start)

What might be the reason for that?

Ahmad Ismail
    I don’t know the full answer, but your main issue is that you’ve changed the function *references* to function *calls*, and so you’re calling your worker before `runInParallel` even starts. I think you need another argument from `Process`. – dantiston Oct 07 '20 at 14:12

4 Answers


It's because of the difference between worker and worker(). The first is the function itself; the second is a function call. On the line runInParallel(worker(2), worker(2), worker(2), worker(2)), all four calls run one after another, before runInParallel even begins executing. What runInParallel then receives is their return values (None, four times) rather than functions, so each Process has no target and exits immediately. If you add a print(fns) at the beginning of runInParallel, you will see the difference.
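
To make the difference concrete, here is a minimal sketch without any processes. The describe helper is hypothetical (it is not part of the question's code) and only reports whether each argument it receives is still callable:

```python
def worker(ii):
    # Like the worker in the question, this returns None implicitly.
    return None


def describe(*fns):
    # True for a function object, False for a return value such as None.
    return [callable(fn) for fn in fns]


print(describe(worker, worker))        # [True, True]
print(describe(worker(2), worker(2)))  # [False, False]
```

In the second call, worker(2) has already run by the time describe starts, which is exactly what happens to runInParallel in the question.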

Quick fix:

def worker_caller():
    worker(2)

and:

runInParallel(worker_caller, worker_caller, worker_caller, worker_caller)

That's not very convenient but it's mostly intended to show what the problem is. The problem is not in the function worker. The problem is that you're mixing up passing a function and passing a function call. If you changed your first version to:

runInParallel(worker(), worker(), worker(), worker())

then you would run into exactly the same issue.

But you can do this:

runInParallel(lambda:worker(2), lambda: worker(2), lambda: worker(2), lambda: worker(2))

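Lambdas are very useful, with one caveat: a lambda cannot be pickled, so this only works when child processes are created with the fork start method (traditionally the default on Linux). With the spawn start method, which is the default on Windows and macOS, Process.start() fails with a pickling error. Here is another version, with a different argument for each worker: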

a = lambda:worker(2)
b = lambda:worker(4)
c = lambda:worker(3)
d = lambda:worker(1)

runInParallel(a, b, c, d)
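
If lambdas are a problem on your platform, functools.partial from the standard library is one possible alternative. This is only a sketch, assuming worker is defined at module level; a partial object bundles the function with its arguments and, unlike a lambda, can be pickled. The sleep times are shortened here purely for illustration:

```python
import time
from functools import partial
from multiprocessing import Process


def worker(ii):
    time.sleep(ii)
    print("Working")


def runInParallel(*fns):
    procs = []
    for fn in fns:
        # fn is a partial object; calling it runs worker with the bound argument.
        p = Process(target=fn)
        p.start()
        procs.append(p)
    for p in procs:
        p.join()


if __name__ == '__main__':
    start = time.time()
    runInParallel(partial(worker, 1), partial(worker, 2),
                  partial(worker, 1), partial(worker, 2))
    print("Total time taken: ", time.time() - start)
```

runInParallel itself is unchanged from the question; only the way the arguments are bound differs.
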
klutt

You should modify runInParallel so that each argument is a tuple of the function followed by its arguments, and split that tuple with iterable unpacking.

import time
from multiprocessing import Process

def worker(ii):
    time.sleep(ii)
    print("Working")

def runInParallel(*fns):
    proc = []
    for fn in fns:
        func, *args = fn
        p = Process(target=func, args=args)
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    start = time.time()
    runInParallel((worker, 2), (worker, 3), (worker, 5), (worker, 2))
    print("Total time taken: ", time.time()-start)
gold_cy
    Although it wasn't asked for in this case, a generic processing function would probably benefit from supporting any number of positional arguments, like `func, *args = fn; p = Process(target = func, args = args)` – Arty Oct 07 '20 at 14:22

To pass arguments, you need to pass them to the Process constructor:

        p = Process(target=fn, args=(arg1,))
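
As a small self-contained sketch of that line (make_process is a hypothetical helper name, used only for illustration), note the trailing comma: (arg1,) is a one-element tuple, whereas (arg1) is just arg1 in parentheses.

```python
from multiprocessing import Process


def worker(ii):
    print("Working with", ii)


def make_process(fn, arg1):
    # args must be a tuple of positional arguments, hence the trailing comma.
    return Process(target=fn, args=(arg1,))


if __name__ == '__main__':
    p = make_process(worker, 2)
    p.start()
    p.join()
```
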
Jiří Baum

The Process constructor accepts args and kwargs parameters, which are passed to the target function when the process runs. The documentation is quite clear about this.

So your code should be modified something like this:

import time
from multiprocessing import Process

def worker(ii):
    time.sleep(ii)
    print("Working")

def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = Process(target=fn, args=(2,))
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    start = time.time()
    runInParallel(worker, worker, worker, worker)
    print("Total time taken: ", time.time()-start)

Of course, the parameters can be different for each process; you need to arrange for the right ones to be passed to each in args (or kwargs for keyword parameters). This can be achieved by passing tuples, for example runInParallel((worker,2), (worker,3), (worker,5), (worker,1)), and then processing the tuples inside runInParallel.
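
One way to process such tuples while also supporting keyword parameters could look like the sketch below. The (func, args, kwargs) layout, the normalize helper and the label parameter of worker are assumptions made for this example, not something the question requires:

```python
import time
from multiprocessing import Process


def worker(ii, label="Working"):
    time.sleep(ii)
    print(label)


def normalize(task):
    # Accept (func,), (func, args) or (func, args, kwargs).
    func = task[0]
    args = task[1] if len(task) > 1 else ()
    kwargs = task[2] if len(task) > 2 else {}
    return func, tuple(args), dict(kwargs)


def runInParallel(*tasks):
    procs = []
    for task in tasks:
        func, args, kwargs = normalize(task)
        p = Process(target=func, args=args, kwargs=kwargs)
        p.start()
        procs.append(p)
    for p in procs:
        p.join()


if __name__ == '__main__':
    start = time.time()
    runInParallel(
        (worker, (1,)),
        (worker, (2,), {"label": "Still working"}),
    )
    print("Total time taken: ", time.time() - start)
```

Keeping the unpacking in a separate function means it can be checked without starting any processes.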

xxa