pickle is quite limited in what it can serialize. The full list is given in the docs, here (Python 3):
https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled
and here (Python 2):
https://docs.python.org/2/library/pickle.html#what-can-be-pickled-and-unpickled
It gets worse. Pickling doesn't really work for objects defined in the interpreter, mainly because pickle primarily serializes by reference. It doesn't actually pickle the function or the class object; it serializes a string that is essentially their module and name:
>>> import pickle
>>> import math
>>> pickle.dumps(math.sin)
'cmath\nsin\np0\n.'
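The transcript above is from Python 2. As a rough Python 3 sketch of the same point (assuming protocol 0, chosen here only because its text format is easy to read), the payload still just names the module and the function, and loading it hands back the original object rather than a copy:

```python
import math
import pickle

# Protocol 0 is the text-based format, so the payload is easy to read.
payload = pickle.dumps(math.sin, protocol=0)
print(payload)  # a short byte string naming the module and the function

# Loading the payload imports math and looks up "sin" again,
# so we get back the very same object rather than a copy.
restored = pickle.loads(payload)
print(restored is math.sin)
```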
So, if you have built your function, class, or whatever in the interpreter, then you essentially can't pickle the object with pickle. It stores only a reference into the __main__ module, and when the data is loaded somewhere else (a new process has its own __main__), pickle typically can't find the object there. This is also why things fail to serialize with multiprocessing in the interpreter.
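Here is a minimal sketch of that failure mode using only the standard library. The module name scratch_mod and the helper demo_missing_definition are made up for illustration; the throwaway module stands in for the interpreter's __main__, and deleting the function from it imitates a process that never saw the definition:

```python
import pickle
import sys
import types


def demo_missing_definition():
    # Hypothetical stand-in for the interpreter's __main__ module.
    scratch = types.ModuleType("scratch_mod")
    sys.modules["scratch_mod"] = scratch
    exec("def squared(x):\n    return x**2\n", scratch.__dict__)

    # The payload only records "scratch_mod" and "squared", not the code.
    payload = pickle.dumps(scratch.squared)
    same_object = pickle.loads(payload) is scratch.squared

    # Imitate a process whose module lacks the definition.
    del scratch.squared
    try:
        pickle.loads(payload)
        error_name = None
    except AttributeError as exc:
        error_name = type(exc).__name__
    finally:
        del sys.modules["scratch_mod"]
    return same_object, error_name


result = demo_missing_definition()
print(result)  # (True, 'AttributeError')
```

The round trip succeeds only while the name can still be resolved; once the definition is gone, the reference in the payload points at nothing.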
However, there is a good solution. You could use a better serializer (like dill), and a fork of multiprocessing that leverages a better serializer (like pathos.multiprocessing).
>>> import dill
>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> p = Pool()
>>>
>>> def squared(x):
...     return x**2
...
>>> dill.dumps(squared)
'\x80\x02cdill.dill\n_create_function\nq\x00(cdill.dill\n_unmarshal\nq\x01Ufc\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x08\x00\x00\x00|\x00\x00d\x01\x00\x13S(\x02\x00\x00\x00Ni\x02\x00\x00\x00(\x00\x00\x00\x00(\x01\x00\x00\x00t\x01\x00\x00\x00x(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x07\x00\x00\x00squared\x01\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x02\x85q\x03Rq\x04c__builtin__\n__main__\nU\x07squaredq\x05NN}q\x06tq\x07Rq\x08.'
>>>
>>> p.map(squared, range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>>
There's a decent list of what can and can't be serialized here:
https://github.com/uqfoundation/dill/blob/master/dill/_objects.py
It's not comprehensive, but most things can be serialized with dill.
Get pathos and dill here: https://github.com/uqfoundation