2

Using multiprocessing on methods we get error stating self.methods cannot be pickled.

To overcome this I used:

def _pickle_method(m):
    if m.im_self is None:
        return getattr, (m.im_class, m.im_func.func_name)
    else:
        return getattr, (m.im_self, m.im_func.func_name)


copy_reg.pickle(types.MethodType, _pickle_method)

Now I searched about this, but few questions are still unclear:

  1. Why can't self.methods be pickled?
  2. How does copy_reg.pickle() work? How does it enable pickling of self.methods?
  3. Are there any negatives of using this approach or are there any other, better, methods?

More Info:

I had a function in a class which used to do a request.get and request.post. To improve times , I used multiprocessing over it. This is the exact problem I faced.

Can't pickle <type 'instancemethod'> when using multiprocessing Pool.map()

martineau
  • 119,623
  • 25
  • 170
  • 301
vks
  • 67,027
  • 10
  • 91
  • 124
  • Without more information, it's difficult to understand _why_ `pickle` would be trying to serialize an actual class method (that's not how `pickle` does things)—so it's possible answer is "If you need to do this, you're doing something else wrong." Please [edit] your question and add more context. – martineau Nov 22 '16 at 16:41
  • What exactly is `self.methods`? If possible, please show the class with the method that does the `request.get` and `request.post` (and the related multiprocessing). – martineau Nov 24 '16 at 19:43
  • @martineau i just used the name self.mathods....they are just any methods of class which take self......the code can be found in the linked SO post...its same or similar – vks Nov 24 '16 at 19:45

1 Answers1

3

Your intended usage is still a little vague—so I will instead show a way to solve the linked question in the new "More info" section of your question that doesn't require the pickling the byte-code of a class method. I think how it does it is fairly obvious...

someclass.py

import multiprocessing

def call_method(x):
    return SomeClass().f(x)

class SomeClass(object):
    def __init__(self):
        pass

    def f(self, x):
        return x*x

    def go(self):
        pool = multiprocessing.Pool(processes=4)
        print(pool.map(call_method,  range(10)))
        pool.close()

test.py

import someclass

if __name__== '__main__' :
    sc = someclass.SomeClass()
    sc.go()

Output:

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
martineau
  • 119,623
  • 25
  • 170
  • 301
  • `pickle` doesn't serialize code. It stores enough information about any that needs to be save so it can be imported when needed. The `copy_reg()` function is described in the [online documentation](https://docs.python.org/2/library/copy_reg.html). It's a way to associate (register) a function with a particular data type. This function will be called when an attempt is made to serialize any object encountered of the type specified. – martineau Nov 23 '16 at 09:15
  • ohh ok....that clear up things...just one questions remains....why cant multiprocess pickle self.methods whereas it can when using normal functions.The SO link in question does explain but can you throw more light over it.M still not sure about that part. – vks Nov 23 '16 at 09:18
  • @vks: I'm not sure what you mean by it can with "normal functions". There's not need to pickle built-in functions because they're always available. – martineau Jul 04 '17 at 15:15