I am trying to parallelize a simple function in Python as follows:

import numpy as np
import math
import concurrent.futures

def f(x):
    return x * math.sin(x) + x * x * math.cos(x)

xs = np.random.normal(0, 1, 100000)

# This takes about a second
ans1 = map(f, xs)

# This ran for about 30 minutes before I gave up
with concurrent.futures.ProcessPoolExecutor() as executor:
    ans2 = executor.map(f, xs)

I understand that this problem is probably too small for parallelization to actually be effective, but I expected the parallelized version of this to take on the order of seconds, not 30+ minutes. What is going wrong here?

Everyone_Else
  • Are you on Windows? Note that `map` doesn't actually execute anything; it returns a lazy iterator over your results. Unless you are on Python 2? – juanpa.arrivillaga Jul 16 '19 at 19:10
  • This is Python 2 on Windows. I think `map(f, xs)` finishing quickly is expected; I included it mainly to show that I expect this computation to be fast, which is why I am surprised that the parallelized version is not. – Everyone_Else Jul 16 '19 at 19:36
  • Ah, then you are almost certainly creating a multiprocessing bomb. You **must** guard the multiprocessing step with `if __name__ == "__main__"` because Windows does not fork. It starts a new process that runs your script again, so without the guard you keep creating more and more processes without stopping. – juanpa.arrivillaga Jul 16 '19 at 19:38
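A minimal sketch of the guard described in the comment above, not a tested fix for the asker's exact setup. The helper name `run_parallel` and the smaller input size are hypothetical, and the standard-library `random.gauss` stands in for `np.random.normal` only to keep the sketch free of third-party imports. On Python 2, `concurrent.futures` is assumed to come from the `futures` backport package.

```python
import math
import random
import concurrent.futures


def f(x):
    return x * math.sin(x) + x * x * math.cos(x)


def run_parallel(xs):
    # Hypothetical helper: the pool is created only when this function is
    # called, never as a side effect of importing the module.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # list() collects every result before the pool shuts down.
        return list(executor.map(f, xs))


if __name__ == "__main__":
    # Only the original process runs this block. Worker processes that
    # re-import the script skip it, so they do not start pools of their own.
    xs = [random.gauss(0, 1) for _ in range(1000)]  # smaller than the 100000 in the question
    ans2 = run_parallel(xs)
    print(len(ans2))
```

Even with the guard in place, each call to `f` is so cheap that the cost of sending values between processes will probably dominate, so the plain `map(f, xs)` may still be the faster option for this function.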
