
I am using multiprocessing.Pool to speed up a computation: I call one function many times and then collate the results. Here is a snippet of my code:

import multiprocessing
from functools import partial

def Foo(id:int,constant_arg1:str, constant_arg2:str):
    custom_class_obj = CustomClass(constant_arg1, constant_arg2)
    custom_class_obj.run() # this changes some attributes of the custom_class_obj
    
    if something:
        return None
    else:
        return [custom_class_obj]



def parallel_run(iters: int, a: str, b: str):
    pool = multiprocessing.Pool(processes=k)

    ## create the partial function obj before passing it to pool
    partial_func = partial(Foo, constant_arg1=a, constant_arg2=b)

    ## create the variable id list
    iter_list = list(range(iters))
    all_runs = pool.map(partial_func, iter_list)

    return all_runs

This throws the following error in the multiprocessing module:

multiprocessing.pool.MaybeEncodingError: Error sending result: '[[<CustomClass object at 0x1693c7070>], [<CustomClass object at 0x1693b88e0>], ....]'
Reason: 'TypeError("cannot pickle 'module' object")'

How can I resolve this?

Karl Knechtel
204
  • You'd need to make your custom class pickleable. That error however suggests that you're trying to return the *module*, not a custom class. – Carcigenicate Jul 10 '21 at 14:58
  • I am returning a CustomClass object (as seen in the list shown after 'result' in the error message). But, is there a way to use Pool for classes that are not pickleable? – 204 Jul 10 '21 at 15:41
  • You're going to have to post your `CustomClass`. See [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example). – Booboo Jul 17 '21 at 11:57

1 Answer

I was able to replicate the error message with a minimal example of an unpicklable class. The error states that an instance of your class can't be pickled because it holds a reference to a module, and modules are not picklable. Results returned from a Pool worker are pickled to be sent back to the main process, so you need to comb through CustomClass and make sure its instances don't hold things like open file handles or module references. If you do need those things, use __getstate__ and __setstate__ to customize pickling and unpickling.
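One way to do that combing is to try pickling each instance attribute on its own and see which ones fail. This is only a minimal sketch: `find_unpicklable_attrs` and the `Example` class are hypothetical names made up for illustration, and the helper only looks at the top-level attributes in `vars(obj)`, so it won't see anything stored in `__slots__`.

```python
import pickle


def find_unpicklable_attrs(obj):
    """Return {attribute name: exception} for each instance attribute pickle rejects."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle can raise TypeError, PicklingError, AttributeError, ...
            failures[name] = exc
    return failures


class Example:
    """Hypothetical stand-in for CustomClass: one plain value, one module reference."""

    def __init__(self):
        import os
        self.value = 6
        self.module = os


if __name__ == "__main__":
    for name, exc in find_unpicklable_attrs(Example()).items():
        print(f"{name}: {exc!r}")
```

Running it on an instance of your real class should point at the attribute(s) that need special handling.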

Distilled example of your error:

from multiprocessing import Pool
from functools import partial

class klass:
    def __init__(self, a):
        self.value = a
        import os
        self.module = os  # this fails: a module can't be pickled and sent back to the main process

def foo(a, b, c):
    return klass(a+b+c)

if __name__ == "__main__":
    with Pool() as p:
        a = 1
        b = 2
        bar = partial(foo, a, b)
        res = p.map(bar, range(10))
    print([r.value for r in res])
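For comparison, here is a rough sketch of how `__getstate__` and `__setstate__` could be applied to the distilled example so that the results make it back to the main process. It assumes the module reference can simply be left out of the pickle and re-attached on the other side, which may not hold for your `CustomClass`. The class is renamed `Klass` here and the `import os` is moved to module level; both are my choices for the sketch.

```python
import os
from functools import partial
from multiprocessing import Pool


class Klass:
    def __init__(self, a):
        self.value = a
        self.module = os  # unpicklable attribute, as in the failing example

    def __getstate__(self):
        # Copy the instance dict and leave the module reference out of the pickle.
        state = self.__dict__.copy()
        del state["module"]
        return state

    def __setstate__(self, state):
        # Restore the picklable attributes, then re-attach the module.
        self.__dict__.update(state)
        self.module = os


def foo(a, b, c):
    return Klass(a + b + c)


if __name__ == "__main__":
    with Pool() as p:
        bar = partial(foo, 1, 2)
        res = p.map(bar, range(10))
    print([r.value for r in res])
```

The same pattern works for other unpicklable attributes (for example an open file handle), as long as `__setstate__` has enough information to recreate them.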
Aaron
  • Thanks for the answer! Yes, my class does have module references. In fact, it explicitly contains a 'Program' class object which internally contains a Python program file. But writing __getstate__ and __setstate__ for the class is proving to be tricky. Is there an alternative workaround using the 'dill' serializer? – 204 Jul 20 '21 at 16:39
  • I've not done this, but a cheap hack could possibly be to call `dill` to serialize `self` inside `__getstate__` and deserialize inside `__setstate__`. There's also the `pathos` library which I have also not worked with, but I'm aware it uses `dill` for some things – Aaron Jul 20 '21 at 16:45
  • maybe also [this](https://stackoverflow.com/a/40245339/3220135) – Aaron Jul 20 '21 at 16:49