
I'm doing some multiprocessing in Python with classes, and to make that work I had to use this approach:

def _pickle_method(method):
    func_name = method.im_func.__name__
    obj = method.im_self
    cls = method.im_class
    # Deal with name-mangled private methods (e.g. __foo becomes _ClassName__foo).
    if func_name.startswith('__') and not func_name.endswith('__'):
        cls_name = cls.__name__.lstrip('_')
        func_name = '_' + cls_name + func_name
    return _unpickle_method, (func_name, obj, cls)


def _unpickle_method(func_name, obj, cls):
    for cls in cls.__mro__:
        try:
            func = cls.__dict__[func_name]
        except KeyError:
            pass
        else:
            break
    return func.__get__(obj, cls)

The problem is that I have some static methods that should be parallelized too, but I found that I can't pickle static methods with this approach. Is there a way to change these functions so that I can pickle both non-static and static methods?

Thank you in advance.

pceccon
  • Why do you need this approach? Are your class methods dynamically generated? – Martijn Pieters Jan 14 '14 at 12:20
  • This was the only way I found to use multiprocessing with a class method. – pceccon Jan 14 '14 at 12:31
  • Why not pass a regular function to the pool instead, and in that function call the methods? – Martijn Pieters Jan 14 '14 at 12:32
  • But this is what I'm trying to do. I don't understand what you mean. – pceccon Jan 14 '14 at 12:38
  • Instead of `pool.apply_async(ImageData.shepard_interpolation, args=[ImageData()])`, create a separate function `def apply_shepard_interpolation(img): img.shepard_interpolation()` and pass that to the pool with `pool.apply_async(apply_shepard_interpolation, args=[ImageData()])`. – Martijn Pieters Jan 14 '14 at 12:40
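The wrapper suggested in the last comment could look roughly like the sketch below. The `ImageData` body here is a hypothetical stand-in (the real class is not shown in the question), and the static method `clamp` and its wrapper `apply_clamp` are names invented for illustration. The idea is that a module-level function is pickled by name, so the pool can send it to a worker, and the method lookup then happens inside the worker.

```python
from multiprocessing import Pool


class ImageData(object):
    """Hypothetical stand-in for the asker's class; method bodies are placeholders."""

    def __init__(self, scale=2):
        self.scale = scale

    def shepard_interpolation(self):
        # Placeholder for the real computation.
        return self.scale * 10

    @staticmethod
    def clamp(value):
        # Hypothetical static method, added only for illustration.
        return max(0, min(255, value))


def apply_shepard_interpolation(img):
    # Module-level function: pickled by name, so the pool can ship it to a worker.
    return img.shepard_interpolation()


def apply_clamp(value):
    # Same trick for a static method: look it up on the class inside the worker.
    return ImageData.clamp(value)


if __name__ == "__main__":
    pool = Pool(2)
    try:
        result = pool.apply_async(apply_shepard_interpolation, args=[ImageData()])
        print(result.get())
        print(pool.map(apply_clamp, [-5, 100, 300]))
    finally:
        pool.close()
        pool.join()
```

This only requires that the instances passed as arguments are themselves picklable; no custom `_pickle_method` registration is involved.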

1 Answer

I'm not really sure what you were trying to do… but if you want to work with classes and multiprocessing, it's going to be ugly unless you jump outside the standard library.

If you use a fork of multiprocessing called pathos.multiprocessing, you can directly use classes and class methods in multiprocessing's map functions. This is because dill is used instead of pickle or cPickle, and dill can serialize almost anything in Python.

pathos.multiprocessing also provides an asynchronous map function… and it can map functions with multiple arguments (e.g. map(math.pow, [1,2,3], [4,5,6]))
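For reference, the multiple-argument form pairs the sequences element-wise, the same way the builtin `map` does. The snippet below is only a serial sketch of that calling convention using the standard library, not pathos code; the variable names are chosen for illustration.

```python
import math

bases = [1, 2, 3]
exponents = [4, 5, 6]

# The builtin map pairs the sequences element-wise:
# math.pow(1, 4), math.pow(2, 5), math.pow(3, 6).
serial = list(map(math.pow, bases, exponents))
print(serial)  # [1.0, 32.0, 729.0]
```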

See: What can multiprocessing and dill do together?

and: http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization/

>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> 
>>> p = Pool(4)
>>> 
>>> def add(x,y):
...   return x+y
... 
>>> x = [0,1,2,3]
>>> y = [4,5,6,7]
>>> 
>>> p.map(add, x, y)
[4, 6, 8, 10]
>>> 
>>> class Test(object):
...   def plus(self, x, y): 
...     return x+y
... 
>>> t = Test()
>>> 
>>> p.map(Test.plus, [t]*4, x, y)
[4, 6, 8, 10]
>>> 
>>> p.map(t.plus, x, y)
[4, 6, 8, 10]

Get the code here: https://github.com/uqfoundation/pathos

Mike McKerns