I'm experimenting with the multiprocessing.Pool functionality, but am running into a weird problem where using pool.map is much slower than just using the normal map function.

I have already looked on SO for previous answers. Any threads I found just create more questions.

Here is my code:

from multiprocessing import Pool
import time


def f(x):
    time.sleep(1)
    return x**3


if __name__ == '__main__':

    vector = [1, 2, 3, 4] 

    # Single Process
    start = time.time()
    sp = map(f, vector)
    print(f"Single Process time: {time.time() - start}s")
    print(f"Should equal ~4.00 s")

    print("")

    # Multiprocessing
    p = Pool(processes=4)
    start = time.time()
    with p:
        mp = p.map(f, vector)
    print(f"Multi-Process time: {time.time() - start}s")
    print(f"Should equal ~1.00 s")

Here is the printed output that I get:

Single Process time: 7.152557373046875e-07s
Should equal ~4.00 s

Multi-Process time: 1.1352498531341553s
Should equal ~1.00 s

I originally tried something very similar to this thread: Why is pool.map slower than normal map? and ran into the same problem the original poster did. So I shortened the input to four elements and added a time.sleep() call inside f so the expected runtime would be obvious. That didn't help.

Using imap seems to shorten the multiprocessing time by a large amount, but my single-process version still finishes in under 1 second (which seems impossible, given that f sleeps for 1 full second per element -- it should take at least four seconds!). I've also tried changing the number of processes passed to Pool(). I'm running on a dual-core MacBook Pro with 4 logical cores. Here's roughly what the imap attempt looked like, in case it's relevant (a sketch from memory, not my exact code, assuming the same f and vector as above):
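    # imap attempt (sketch): imap hands back an iterator immediately,
    # and nothing here consumes it, so the timer stops right away
    start = time.time()
    with Pool(processes=4) as p:
        mp = p.imap(f, vector)
    print(f"imap time: {time.time() - start}s")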

1 Answer


Ahh, just figured it out a minute after posting.

For anybody curious: you need to consume the map object, for example by passing it to list(), like so:

sp = map(f, vector)
list(sp)

I think this is because Python 3's map is lazily evaluated: it returns an iterator, and f isn't actually called until that iterator is consumed, so my timer was only measuring the creation of the iterator, not the work itself.
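
For completeness, here's a sketch of the corrected timing code (same f and vector as in the question). list() forces the single-process map to actually run inside the timed region, while Pool.map blocks and already returns a fully evaluated list, so it needs no extra call:

from multiprocessing import Pool
import time


def f(x):
    time.sleep(1)
    return x**3


if __name__ == '__main__':
    vector = [1, 2, 3, 4]

    # Single process: list() consumes the iterator, forcing f to run
    start = time.time()
    sp = list(map(f, vector))
    print(f"Single Process time: {time.time() - start}s")  # now ~4 s

    # Multiprocessing: Pool.map blocks and returns a plain list,
    # so no extra list() call is needed
    with Pool(processes=4) as p:
        start = time.time()
        mp = p.map(f, vector)
    print(f"Multi-Process time: {time.time() - start}s")  # ~1 s with 4 workers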