Is it efficient to calculate many results in parallel with `multiprocessing.Pool.map()` in a situation where each input value is large (say 500 MB), but where the input values generally contain the same large object? I am afraid that the way `multiprocessing` works is by sending a pickled version of each input value to the worker processes in the pool. If no optimization is performed, this would mean sending a lot of data for each input value in `map()`. Is this the case? I had a quick look at the `multiprocessing` code but did not find anything obvious.
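For instance, my understanding is that one can gauge the per-item cost by pickling a single input value by hand (the shapes below are just placeholders, scaled down from the real ~500 MB matrices; NumPy is only used for illustration):

```python
import pickle

import numpy as np

# Scaled-down stand-ins: the real matrices are ~500 MB each.
big_matrix = np.random.rand(2000, 2000)           # ~32 MB of float64
input_value = (np.random.rand(2000), big_matrix)  # one (vector, matrix) input

# Roughly the amount of data that would be pickled and sent to a worker
# for this single input value, if no optimization is performed.
print(len(pickle.dumps(input_value, protocol=pickle.HIGHEST_PROTOCOL)) / 1e6, "MB")
```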
More generally, what simple parallelization strategy would you recommend for doing a `map()` over, say, 10,000 values, each of them being a tuple `(vector, very_large_matrix)`, where the vectors are always different but there are only, say, 5 distinct very large matrices?
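Concretely, the naive version I have in mind looks roughly like this (`compute` is a placeholder for my actual per-item function, and the array sizes are scaled down):

```python
from multiprocessing import Pool

import numpy as np


def compute(args):
    """Placeholder for the real per-item work on one (vector, matrix) pair."""
    vector, very_large_matrix = args
    return very_large_matrix.dot(vector)


if __name__ == "__main__":
    # 5 distinct large matrices shared by 10,000 different vectors.
    matrices = [np.random.rand(2000, 2000) for _ in range(5)]
    vectors = [np.random.rand(2000) for _ in range(10_000)]

    # Each input value is a (vector, very_large_matrix) tuple; my concern is
    # that every tuple, matrix included, gets pickled and sent to a worker.
    inputs = [(vec, matrices[i // 2000]) for i, vec in enumerate(vectors)]

    with Pool() as pool:
        results = pool.map(compute, inputs)
```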
PS: the big input matrices actually appear "progressively": 2,000 vectors are first sent along with the first matrix, then 2,000 vectors are sent with the second matrix, etc.
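One strategy I have been considering, given that structure, is to process each batch of 2,000 vectors with its own pool and pass the corresponding matrix through the pool's initializer, so it is sent once per worker process rather than once per vector; just a sketch:

```python
from multiprocessing import Pool

# Worker-side global, set once per worker process by the initializer.
_worker_matrix = None


def _init_worker(matrix):
    global _worker_matrix
    _worker_matrix = matrix


def compute(vector):
    """Placeholder per-item work, using the matrix already stored in the worker."""
    return _worker_matrix.dot(vector)


def process_batch(matrix, vectors):
    # The matrix is transferred to each worker only when the pool starts,
    # not with every vector.
    with Pool(initializer=_init_worker, initargs=(matrix,)) as pool:
        return pool.map(compute, vectors)
```

Would something along these lines be the recommended approach, or is there a simpler way?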