
I need to speed up my Python application. The solution was supposed to be trivial:

import time
from multiprocessing import Pool


class A:
    def method1(self):
        time.sleep(1)
        print('method1')
        return 'method1'

    def method2(self):
        time.sleep(1)
        print('method2')
        return 'method2'

    def method3(self):
        pool = Pool()
        time1 = time.time()
        res1 = pool.apply_async(self.method1, [])
        res2 = pool.apply_async(self.method2, [])
        res1 = res1.get()
        res2 = res2.get()
        time2 = time.time()
        print('res1 = {0}'.format(res1))
        print('res2 = {0}'.format(res2))
        print('time = {0}'.format(time2 - time1))



a = A()
a.method3()

But every time I launch this simple program I get an exception:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.2/threading.py", line 740, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.2/threading.py", line 693, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.2/multiprocessing/pool.py", line 346, in _handle_tasks
    put(task)
_pickle.PicklingError: Can't pickle <class 'method'>: attribute lookup builtins.method failed

It seems Python can't parallelize class methods. I have been googling desperately all day, but I still don't understand how to work around it (possibly I haven't searched well enough). The Python multiprocessing documentation seems to be very poor.

I don't want to break up my class by splitting it into global functions. The following code seems to work:

class B:
    def method1(self):
        time.sleep(1)
        print("B.method1")  # print() call, consistent with the Python 3 code above
        return "B.method1"

    def method2(self):
        time.sleep(1)
        print("B.method2")
        return "B.method2"

# Module-level wrappers: the pool receives these instead of bound methods.
def method1(b):
    time.sleep(1)
    print('method1')
    return b.method1()


def method2(b):
    time.sleep(1)
    print('method2')
    return b.method2()


def method3():
    pool = Pool()
    time1 = time.time()
    b = B()
    res1 = pool.apply_async(method1, [b])
    res2 = pool.apply_async(method2, [b])
    res1 = res1.get()
    res2 = res2.get()
    time2 = time.time()
    print('res1 = {0}'.format(res1))
    print('res2 = {0}'.format(res2))
    print('time = {0}'.format(time2 - time1))

method3()

But I still don't understand how to make Python parallelization work inside a class method. Could someone help me work around this obstacle? Are there other ways of parallelizing that I don't know about and that can be applied in this case? I need working code as an example. Any help would be greatly appreciated.

Lucky Man
  • I might be dumb, but what's the point of parallelizing instance methods anyway? You certainly don't want your processes to share the same mutable state (the instance's state), and if you don't have a mutable state to share, all you need is either a plain function or possibly a class with classmethods / staticmethods only (if you have a use for type-based dispatch). – bruno desthuilliers May 11 '15 at 11:31
  • Your first example works fine when I run it with Python 2.7.6, on an Ubuntu machine. – Emile May 11 '15 at 11:42
  • Multiprocessing relies on pickling to pass functions around. The object being pickled must be capable of being referred to in the global context for the unpickle to be able to access it, hence your instance methods not being allowed. I can't see a problem with your second example. Unless you can give a reason to use a class, why is one needed? – gonkan May 11 '15 at 11:46
  • bruno desthuilliers, you are absolutely right. But it is just a simplified example. My real class is much bigger. In my case, sharing the class's internals is OK because the methods I want to parallelize don't modify its state. They only analyze some fields and arrays (a lot of things) and return a result. It is quite a big but solid class (about 500 lines) which is used in another module. I think it is better to encapsulate the parallelization in it without changing its interface. But of course, if that isn't possible, I will do it. – Lucky Man May 11 '15 at 12:00
  • Emile, sounds great. I have Python 2.7.3. I'll try to install a new version. – Lucky Man May 11 '15 at 12:02
  • @LuckyMan: if all you need is read access to the instance's state, you can explicitly pass it to plain functions, can't you? Also remember that you can still make these functions methods of the class just by making them attributes of the class too. – bruno desthuilliers May 11 '15 at 12:20
  • gonkan, I am a newbie in Python parallelization. I have no idea what pickling is. It was quite surprising to see such exception :) I just wanted to boost my class' method without changing anything else. – Lucky Man May 11 '15 at 12:22
  • bruno desthuilliers, you mean my second example? Yes, I can do it. As for attributes, I should investigate that. Thanks for the tips. – Lucky Man May 11 '15 at 12:26
  • Emile, sorry, the first example was a bit wrong. I have already modified it. Python 3.2 still prints the pickling error. – Lucky Man May 11 '15 at 12:35
  • @Lucky Man - https://docs.python.org/3/library/pickle.html, it's just a way of serialising an object in python, it's not specific to Multiprocessing. – gonkan May 11 '15 at 14:00
  • Does this answer your question? [Can't pickle when using multiprocessing Pool.map()](https://stackoverflow.com/questions/1816958/cant-pickle-type-instancemethod-when-using-multiprocessing-pool-map) – Abdulrahman Bres Jul 06 '21 at 10:41
