I'm seeing some strange behavior with Python `multiprocessing.Pool` performance. In the following code, `data` is an ndarray of millions of images to be resized, and `chunks_list` is `data` split into chunks. I use `pool = Pool(14)`. The function `resize_images` resizes a group of images at once, while `resize_image` resizes a single image.
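For reference, here is a minimal sketch of the setup (the resize bodies are stand-ins; my real code does actual image resizing):

```python
import numpy as np
from multiprocessing import Pool

def resize_image(img):
    # Stand-in for the real resize: downsample by taking every other pixel.
    return img[::2, ::2]

def resize_images(imgs):
    # Resize a whole chunk of images in one task.
    return [resize_image(img) for img in imgs]

if __name__ == "__main__":
    # Small dummy dataset standing in for the millions of real images.
    data = np.random.randint(0, 255, (1000, 64, 64), dtype=np.uint8)
    pool = Pool(14)
```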
The following code:
```python
res = [pool.apply_async(resize_image, args=[img]).get() for img in data]
```
is faster than this code:
```python
chunks_list = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
res = [pool.apply_async(resize_images, args=[imgs]).get() for imgs in chunks_list]
```
Why is that? I expected the opposite, because the first version submits many 'tiny' tasks to the pool of CPUs, while chunking produces far fewer submissions. Is there a more efficient way to achieve what I want? (GPU maybe?)
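For example, would something like `pool.map` with an explicit `chunksize`, sketched below with a placeholder chunk size, be a better fit here?

```python
# Sketch: let the pool handle chunking and result collection itself.
# chunksize=100 is a placeholder; the best value would need tuning.
res = pool.map(resize_image, data, chunksize=100)
```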