22

I have to pickle an array of objects like this:

import cPickle as pickle
from numpy import sin, cos, array
tmp = lambda x: sin(x)+cos(x)
test = array([[tmp,tmp],[tmp,tmp]],dtype=object)
pickle.dump( test, open('test.lambda','w') )

and it gives the following error:

TypeError: can't pickle function objects

Is there a way around that?

Saullo G. P. Castro
  • Seems like a strange thing to do. What's the use-case? – Aya May 18 '13 at 16:23
  • @Aya lambdify in SymPy makes it very convenient to create lambda functions. And I want to evaluate them using Cython. You can [refer to this other question for further information](http://stackoverflow.com/questions/16295140/numerical-integration-over-a-matrix-of-functions-sympy-and-scipy) – Saullo G. P. Castro May 18 '13 at 16:25
  • Well, I don't know much about Cython, but Martijn's solution will only work if it's possible for Cython to import the Python file in which the `tmp(x)` function was defined. – Aya May 18 '13 at 16:31

3 Answers

23

The built-in pickle module cannot serialize several kinds of Python objects, including lambda functions, nested functions, and functions defined at the command line.
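
As a quick check of that limitation, here is a minimal Python 3 sketch (the post itself targets Python 2) that tries to pickle a lambda, a nested function, and a built-in with the standard library only. The helper names `make_nested` and `try_pickle` are made up for illustration, and because the exact exception type differs between Python versions, several are caught.

```python
import pickle


def make_nested():
    """Return a function defined inside another function."""
    def inner(x):
        return x * 5
    return inner


def try_pickle(obj):
    """Return True if the standard pickler can serialize obj, else False."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


results = {
    "lambda": try_pickle(lambda x: x * 5),
    "nested function": try_pickle(make_nested()),
    "built-in len": try_pickle(len),
}
for label, ok in results.items():
    print(label, "->", "pickled" if ok else "not picklable")
```

The built-in succeeds because it can be found again by module and name; the other two have no importable name to refer to.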

The picloud package includes a more robust pickler that can pickle lambda functions.

from pickle import dumps
f = lambda x: x * 5
dumps(f) # error
from cloud.serialization.cloudpickle import dumps
dumps(f) # works

PiCloud-serialized objects can be de-serialized using the normal pickle/cPickle load and loads functions.

Dill provides similar functionality:

>>> import dill           
>>> f = lambda x: x * 5
>>> dill.dumps(f)
'\x80\x02cdill.dill\n_create_function\nq\x00(cdill.dill\n_unmarshal\nq\x01Uec\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x08\x00\x00\x00|\x00\x00d\x01\x00\x14S(\x02\x00\x00\x00Ni\x05\x00\x00\x00(\x00\x00\x00\x00(\x01\x00\x00\x00t\x01\x00\x00\x00x(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00<lambda>\x01\x00\x00\x00s\x00\x00\x00\x00q\x02\x85q\x03Rq\x04c__builtin__\n__main__\nU\x08<lambda>q\x05NN}q\x06tq\x07Rq\x08.'
Mike McKerns
ChrisB
  • thank you! with picloud package it worked! The Dill I did not test yet... The created pickle can be loaded using the conventional pickle or cPickle modules – Saullo G. P. Castro May 18 '13 at 17:08
  • Is there any way to use this pickler with the multiprocessing library? – John Salvatier Dec 05 '13 at 08:16
  • The answer is: sort of http://stackoverflow.com/questions/19984152/what-can-multiprocessing-and-dill-do-together – John Salvatier Dec 05 '13 at 08:30
  • For clarity, `dill` can be used with a fork of `multiprocessing`. The `picloud` serializer can't as of yet, and I believe is currently unsupported or has been bought out or something. http://blog.picloud.com/2013/11/17/picloud-has-joined-dropbox/ – Mike McKerns Jun 11 '14 at 21:56
9

You'll have to use an actual function instead, one that is importable (not nested inside another function):

import cPickle as pickle
from numpy import sin, cos, array
def tmp(x):
    return sin(x)+cos(x)
test = array([[tmp,tmp],[tmp,tmp]],dtype=object)
# binary mode keeps the pickle intact on every platform; 'with' closes the file
with open('test.lambda', 'wb') as f:
    pickle.dump(test, f)

The function object could still be produced by a lambda expression, but only if you subsequently give the resulting function object the same name as the module-level variable it is bound to:

tmp = lambda x: sin(x)+cos(x)
tmp.__name__ = 'tmp'
test = array([[tmp, tmp], [tmp, tmp]], dtype=object)

because pickle stores only the module and name for a function object; in the above example, tmp.__module__ and tmp.__name__ now point right back at the location where the same object can be found again when unpickling.
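
To make the "module plus name" point concrete, here is a small Python 3 sketch using only the standard library. It pickles `math.sin`, which is chosen purely for illustration, and prints the payload so the stored reference is visible.

```python
import math
import pickle

# Protocol 0 is text-based, so the stored module and name are easy to spot.
payload = pickle.dumps(math.sin, protocol=0)
print(payload)

# Unpickling imports the module and looks the name up again,
# so the very same function object comes back.
restored = pickle.loads(payload)
print(restored is math.sin)
```

No code for the function is stored, only the reference. Note that on Python 3 the lookup goes through `__qualname__`, so the renaming trick above would most likely also need `tmp.__qualname__ = 'tmp'` there.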

Martijn Pieters
  • I guess that kind of answer can’t be used on built‑in functions of C based modules *(even if the ᴏꜱ and architecture stay the same)*. – user2284570 Aug 19 '16 at 21:05
  • @user2284570: pickle has specific facilities for storing references to C structures. However, to 'pickle' a function, all that is stored is a set of strings (module plus name within the module) that are dereferenced when unpickling again. – Martijn Pieters Aug 19 '16 at 22:13
  • So do you mean it’s possible to save but not to restore something executable ? I’m only interested in restoring *(creating the dump doesn’t have to be done in python)*. I don’t care what is used *(marshal or cPickle)* as long as no third party modules are used with the exception of numpy. – user2284570 Aug 19 '16 at 23:26
  • @user2284570: yes, it is possible to save a reference to a function that no longer is available when you try to restore. – Martijn Pieters Aug 20 '16 at 08:12
  • By reference, do you mean the pythonic function name or the function address ? I’m interested in restoring the function native ᴄᴘᴜ code on a platform where dynamic linking isn’t available. If only restoring its reference is possible, I guess the answer is no. – user2284570 Aug 20 '16 at 10:07
  • @user2284570: functions are stored by name (plus the module). Classes are referenced in the same manner (so instances are stored as pickled `__dict__` data plus a name for the class including the module). – Martijn Pieters Aug 20 '16 at 10:09
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/121419/discussion-between-user2284570-and-martijn-pieters). – user2284570 Aug 20 '16 at 10:12
5

There is another solution: define your functions as strings, pickle and unpickle the strings, then use eval to turn them back into functions, e.g.:

import cPickle as pickle
from numpy import sin, cos, array
tmp = "lambda x: sin(x)+cos(x)"
test = array([[tmp,tmp],[tmp,tmp]],dtype=object)
with open('test.lambda', 'wb') as f:
    pickle.dump(test, f)
with open('test.lambda', 'rb') as f:
    mytmp = array([[eval(x) for x in l] for l in pickle.load(f)])
print mytmp
# yields : [[<function <lambda> at 0x00000000033D4DD8>
#            <function <lambda> at 0x00000000033D4E48>]
#           [<function <lambda> at 0x00000000033D4EB8>
#            <function <lambda> at 0x00000000033D4F28>]]

This could be more convenient than the other solutions because the pickled representation is self-contained: it holds only strings and needs no third-party pickler. The names used inside the strings (here sin and cos) must still be in scope when eval is called.
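
The same idea can be sketched for Python 3 with only the standard library. This is an illustrative variant, not the answer's original code: it uses `math` instead of numpy and plain nested lists instead of an object array, and the names `sources`, `namespace`, and `funcs` are made up for the example.

```python
import pickle
from math import cos, sin

# Store the source text of each function rather than the function itself.
source = "lambda x: sin(x) + cos(x)"
sources = [[source, source], [source, source]]
payload = pickle.dumps(sources)

# The names used inside the strings must be supplied when evaluating them.
namespace = {"sin": sin, "cos": cos}
funcs = [[eval(text, namespace) for text in row] for row in pickle.loads(payload)]

print(funcs[0][0](0.0))  # sin(0) + cos(0) = 1.0
```

Passing an explicit namespace to eval keeps the dependency on `sin` and `cos` visible. Since eval runs whatever the strings contain, this approach only suits files you created yourself.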

Rabih Kodeih