Problem description
TL;DR
How can I implement a decorator that doesn't break the pickling of the decorated function?
Goal
`multiprocessing.Pool` can be used to chunk data and distribute the chunks to worker processes, data-parallelising a given function. I'd like to use such an approach within a decorator for user-friendly data-parallelisation. The decorator would typically look like the following:
```python
from multiprocessing import Pool
from functools import partial, wraps

def deco_data_parallel(func):
    @wraps(func)
    def to_parallel(arg, **kwargs):
        part_func = partial(func, **kwargs)
        tot = 0
        with Pool() as p:
            for output in p.imap_unordered(part_func, arg):
                tot += output
        return tot
    return to_parallel
```
The above implementation imposes the following conditions on the function to be parallelised. These limitations can very likely be overcome with a better design.

- `arg` is an iterable to be split into chunks
- The fixed arguments must be passed as keyword arguments
Here is an example of the intended use:
```python
@deco_data_parallel
def compute(data, arg1, arg2):
    return data + arg1 + arg2

if __name__ == "__main__":
    # Dummy data
    data = [4]*100000
    # Fixed arguments must be passed as keyword arguments
    compute(data, arg1=1, arg2=2)
```
Error
The function fed to `imap_unordered` must be picklable. The decorator, however, seems to break the picklability of the original function:
```
_pickle.PicklingError: Can't pickle <function compute at 0x1040137a0>: it's not the same object as __main__.compute
```
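The message can actually be reproduced without any multiprocessing at all. Here is a minimal sketch (the helper attribute `__original__` is my own invention, purely for demonstration) showing that pickle serialises plain functions by reference, i.e. by module and qualified name:

```python
import pickle
from functools import wraps

def deco(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    wrapper.__original__ = func  # keep a handle on the undecorated function
    return wrapper

@deco
def compute(x):
    return x

# pickle stores only "module + qualified name" for a function, then checks
# that looking that name up again yields the very same object. The name
# "compute" now resolves to the wrapper, so pickling the original fails:
try:
    pickle.dumps(compute.__original__)
except pickle.PicklingError as e:
    print(e)  # e.g. "... it's not the same object as __main__.compute"
```

This seems to be exactly what happens in the pool: the `partial` object holds the original `func`, whose name now resolves to the wrapper.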
Best solution
I first thought that `@wraps` was the problem: if the decorated function masquerades as the original function, then the latter can't be found by the pool's worker processes. But it turns out that the `@wraps` decorator doesn't have any effect either way.
Thanks to that great post, I could come up with the following non-optimal solution: creating a new top-level function object by applying the decorator explicitly, as follows. It partially breaks the user-friendliness and is therefore not satisfying, but it nevertheless fulfils the intended purpose.
```python
# Beware: the names should not be the same
compute_ = deco_data_parallel(compute)

if __name__ == "__main__":
    ...
    compute_(data, arg1=1, arg2=2)
```
Questions
- How can the picklability problem be solved in an elegant way, so that the user can simply decorate the function to be parallelised?
- Why doesn't the `@functools.wraps` decorator have any effect?
- What does the error `it's not the same object as __main__.compute` actually mean? I.e. in what sense exactly am I breaking the pickling process?
My configuration
MacPorts Python 3.7.7 on macOS 10.14.6
Disclaimer
I'm quite new to the world of parallel computing in Python, as well as to the world of Python decorators. Heresy is very likely to have happened in this post, and I apologise for that!
This is also my second question on StackOverflow; any suggestion for improvement is welcome.
Detailed investigation
Determined to make this work, I tried multiple decorating strategies. This very complete post on decorators was a great guide. This blog post gave me some hope that an object-oriented decorating strategy could make things work: the author indeed claims that it fixed their picklability problem.
All of the following approaches have been tested, with and without `@wraps`, and they all lead to the same `_pickle.PicklingError`. I'm beginning to feel that I've tried all the non-hacky possibilities Python has to offer, and it would be a great pleasure to be proven wrong!
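As far as I can tell, `@wraps` only copies metadata onto the wrapper, while the function object the pool actually pickles is the original one captured by `partial`, which `@wraps` never touches. A small sketch illustrating this:

```python
from functools import wraps

def deco(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@deco
def f(x):
    return x

# @wraps copies the metadata of f onto the wrapper...
assert f.__name__ == "f"
# ...and stores a handle on the original under __wrapped__:
assert f.__wrapped__ is not f
# But the object captured in the decorator's closure is the *original*
# function; its pickling behaviour is unaffected by @wraps, which is
# consistent with the decorator failing identically with or without it.
assert f.__wrapped__(3) == 3
```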
Functional approach
The simplest approach is the one I showed above. For decorators with arguments, a "decorator factory" can be used as well. Let's use the number of worker processes here for the sake of the example.
```python
def factory_data_parallel(nproc=4):
    def deco_data_parallel(func):
        @wraps(func)
        def to_parallel(arg, **kwargs):
            part_func = partial(func, **kwargs)
            tot = 0
            with Pool(nproc) as p:
                for output in p.imap_unordered(part_func, arg):
                    tot += output
            return tot
        return to_parallel
    return deco_data_parallel

# Usage: only with an argument (or at least with parentheses)
@factory_data_parallel(8)
def compute(data, arg1, arg2):
    ...
```
A hybrid form, usable both as a simple decorator and as a decorator factory, can be implemented as follows:
```python
def factorydeco_data_parallel(_func=None, *, nproc=4):
    def deco_data_parallel(func):
        ...

    if _func is None:
        return deco_data_parallel
    else:
        return deco_data_parallel(_func)

# Usage as a factory (with an argument)
@factorydeco_data_parallel(8)
def compute(data, arg1, arg2):
    ...

# Usage as a simple decorator
@factorydeco_data_parallel
def other_compute(data, arg1, arg2):
    ...
```
Object-oriented approach
From my understanding, a decorator can be any callable object. A simple decorator using an object can be implemented as follows. The first version is applied with parentheses (explicit creation of the object when decorating), and the second one is used as a standard decorator.
```python
class Class_data_parallel(object):
    def __call__(self, func):
        self.orig_func = func

        @wraps(func)
        def to_parallel(arg, **kwargs):
            # Does it make a difference to use the argument func instead?
            part_func = partial(self.orig_func, **kwargs)
            tot = 0
            with Pool() as p:
                for output in p.imap_unordered(part_func, arg):
                    tot += output
            return tot
        return to_parallel

class Class_data_parallel_alt(object):
    def __init__(self, func):
        self.orig_func = func
        # PB: no way I'm aware of to use @wraps

    def __call__(self, arg, **kwargs):
        part_func = partial(self.orig_func, **kwargs)
        tot = 0
        with Pool() as p:
            for output in p.imap_unordered(part_func, arg):
                tot += output
        return tot

# Usage: with parentheses
@Class_data_parallel()
def compute(data, arg1, arg2):
    ...

# Usage: without parentheses
@Class_data_parallel_alt
def other_compute(data, arg1, arg2):
    ...
```
An obvious extension of the first case would be to add some parameters to the constructor; the class would then play the role of a decorator factory.
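For completeness, a sketch of that extension (the class name is my own invention): the constructor arguments configure the pool, and `__call__` does the decorating.

```python
from functools import partial, wraps
from multiprocessing import Pool

class Class_factory_data_parallel(object):
    def __init__(self, nproc=4):
        self.nproc = nproc  # factory parameter, fixed at decoration time

    def __call__(self, func):
        @wraps(func)
        def to_parallel(arg, **kwargs):
            part_func = partial(func, **kwargs)
            tot = 0
            with Pool(self.nproc) as p:
                for output in p.imap_unordered(part_func, arg):
                    tot += output
            return tot
        return to_parallel

@Class_factory_data_parallel(nproc=8)
def compute(data, arg1, arg2):
    ...
```

It suffers from the very same `PicklingError`, of course; it only shows where the parameters would live.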
Some more thinking
- As I mentioned, `@wraps` was a candidate both for being the cause of the problem and for being its solution. Using it or not doesn't change anything.
- The use of `partial` for handling the constant arguments (i.e. constant across processes, `arg1` and `arg2` in my examples) could be a problem, but I doubt it. I could use the `initializer` argument of the `Pool()` constructor instead.
- A. Sherman and P. Den Hartog did achieve that goal in their DECO parallel model. However, I'm not able to understand how they overcame my problem. It seems to prove that what I want to do is not a fundamental limitation of decorators.