
I am trying to write a Python function that computes the sum of a list quickly using parallel computing. Initially I tried the Python threading module, but I noticed that all threads run on the same CPU, so there is no speed gain; I therefore switched to multiprocessing. In the first version I made the list a global variable:

from multiprocessing import Pool

array = 100000000*[1]

def sumPart(fromTo:tuple):
    return sum(array[fromTo[0]:fromTo[1]])

with Pool(2) as pool:
    print(sum(pool.map(sumPart, [(0,len(array)//2), (len(array)//2,len(array))])))

This worked well and returned the correct sum in about half the time of a serial computation.

But then I wanted to make it a function that accepts the array as an argument:

def parallelSum(theArray):
    def sumPartLocal(fromTo: tuple):
        return sum(theArray[fromTo[0]:fromTo[1]])
    with Pool(2) as pool:
        return (sum(pool.map(sumPartLocal, [(0, len(theArray) // 2), (len(theArray) // 2, len(theArray))])))

Here I got an error:

AttributeError: Can't pickle local object 'parallelSum.<locals>.sumPartLocal'

What is the correct way to write this function?

Erel Segal-Halevi

2 Answers


When scheduling jobs to a Python Pool, you need to ensure that both the function and its arguments can be serialized, as they will be transferred over a pipe to the worker processes.

Python uses the pickle protocol to serialize its objects. You can see what can be pickled in the pickle module documentation. In your case, you are facing this limitation:

functions defined at the top level of a module (using def, not lambda)

Under the hood, the Pool sends the function's name as a string together with its parameters. The Python interpreter in the child process looks for that name in the module and fails to find it, because it is nested inside another function, parallelSum.
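
You can reproduce the same limitation without multiprocessing at all. The following small demo is my own illustration (not part of the original answer): pickling a top-level function works, pickling a nested one fails with the same kind of error as in the question.

import pickle

def top_level():            # defined at module level: picklable by reference
    return 42

def outer():
    def nested():           # defined inside another function: cannot be pickled
        return 42
    return nested

pickle.dumps(top_level)      # works: pickled as a reference to 'top_level'
try:
    pickle.dumps(outer())
except (AttributeError, pickle.PicklingError) as e:
    print(e)                 # e.g. Can't pickle local object 'outer.<locals>.nested'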

Move sumPartLocal outside parallelSum and everything will be fine.
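
For example, here is a minimal sketch of that suggestion (my own illustration, not the asker's code; names are illustrative). The helper lives at module level and receives the array as an argument, so it can be pickled; pool.starmap unpacks each tuple into the helper's arguments.

from multiprocessing import Pool

def sumPart(theArray, start, stop):     # top-level function: picklable
    return sum(theArray[start:stop])

def parallelSum(theArray):
    mid = len(theArray) // 2
    with Pool(2) as pool:
        # starmap unpacks each tuple into the arguments of sumPart
        return sum(pool.starmap(sumPart, [(theArray, 0, mid),
                                          (theArray, mid, len(theArray))]))

if __name__ == '__main__':
    print(parallelSum(1000000 * [1]))

Note that the array itself is pickled and sent to each worker, which has its own cost (see the comments on the second answer below).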

noxdafox

I believe you are hitting this known limitation; see also the documentation.

What you could do is define sumPartLocal at module level and pass theArray as the third component of your tuple, so that it is available as fromTo[2] inside the sumPartLocal function.

Example:

from multiprocessing import Pool

def sumPartLocal(fromTo: tuple):
    # fromTo is (start, stop, theArray): the array travels inside the tuple
    return sum(fromTo[2][fromTo[0]:fromTo[1]])

def parallelSum(theArray):
    mid = len(theArray) // 2
    with Pool(2) as pool:
        return sum(pool.map(sumPartLocal, [
            (0, mid, theArray),
            (mid, len(theArray), theArray),
        ]))

if __name__ == '__main__':
    theArray = 100000000*[1]
    s = parallelSum(theArray)
    print(s)

[EDIT 15-Dec-2017 based on comments]

To anyone who is thinking of multi-threading in Python: I strongly recommend reading up on the Global Interpreter Lock (GIL).

There are also some good answers on this question here on SO.
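
To make the point concrete, here is a small timing sketch of my own (not from the linked material): the same CPU-bound work stays roughly serial under threads because of the GIL, while processes actually run in parallel.

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_work(n):
    # pure-Python, CPU-bound loop: holds the GIL the whole time
    return sum(x ** 3 for x in range(n))

def timed(executor_cls):
    start = time.perf_counter()
    with executor_cls(max_workers=2) as ex:
        list(ex.map(cpu_work, [5_000_000, 5_000_000]))
    return time.perf_counter() - start

if __name__ == '__main__':
    print('threads:  ', timed(ThreadPoolExecutor))   # roughly the serial time (GIL)
    print('processes:', timed(ProcessPoolExecutor))  # roughly half, on 2+ cores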

Edwin van Mierlo
  • This returns the correct result. However, it takes twice the time of the global-variable version, and almost as much time as the serial version. For the experiment, instead of just a sum, I did sum([x**3 for x in array]). I took array = 50000000*[1]. On my computer, the serial version takes about 10.8 seconds, the parallel version with a global array takes 6.0 seconds, and the parallel version with a function takes 8.7 seconds... – Erel Segal-Halevi Dec 15 '17 at 09:53
  • 1
    I think I know why my program was slow - it was serializing and de-serializing the entire huge array. I did an experiment where, instead of an array, I compute a sum over a range: sum([math.sin(x)**3 for x in range(0,20000000)]). Here, the multiprocessing version indeed takes about half the time of the serial version. – Erel Segal-Halevi Dec 16 '17 at 22:58
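
A rough sketch of what that comment describes (my own illustration; function names are hypothetical, the workload is the one mentioned in the comment): each worker receives only a pair of range bounds, so nothing large is pickled and sent over the pipe.

import math
from multiprocessing import Pool

def sumPartRange(fromTo):
    # only two integers cross the process boundary, not a large array
    start, stop = fromTo
    return sum(math.sin(x) ** 3 for x in range(start, stop))

def parallelSumRange(n, workers=2):
    step = n // workers
    # the last chunk absorbs any remainder
    bounds = [(i * step, n if i == workers - 1 else (i + 1) * step)
              for i in range(workers)]
    with Pool(workers) as pool:
        return sum(pool.map(sumPartRange, bounds))

if __name__ == '__main__':
    print(parallelSumRange(20000000))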