137

I'm trying to transfer a function across a network connection (using asyncore). Is there an easy way to serialize a python function (one that, in this case at least, will have no side effects) for transfer like this?

I would ideally like to have a pair of functions similar to these:

def transmit(func):
    obj = pickle.dumps(func)
    [send obj across the network]

def receive():
    [receive obj from the network]
    func = pickle.loads(s)
    func()
Jean-François Fabre
  • 137,073
  • 23
  • 153
  • 219
Michael Fairley
  • 12,980
  • 4
  • 26
  • 23

12 Answers12

145

You could serialise the function bytecode and then reconstruct it on the caller. The marshal module can be used to serialise code objects, which can then be reassembled into a function. ie:

import marshal
def foo(x): return x*x
code_string = marshal.dumps(foo.__code__)

Then in the remote process (after transferring code_string):

import marshal, types

code = marshal.loads(code_string)
func = types.FunctionType(code, globals(), "some_func_name")

func(10)  # gives 100

A few caveats:

  • marshal's format (any python bytecode for that matter) may not be compatable between major python versions.

  • Will only work for cpython implementation.

  • If the function references globals (including imported modules, other functions etc) that you need to pick up, you'll need to serialise these too, or recreate them on the remote side. My example just gives it the remote process's global namespace.

  • You'll probably need to do a bit more to support more complex cases, like closures or generator functions.

m01
  • 9,033
  • 6
  • 32
  • 58
Brian
  • 116,865
  • 28
  • 107
  • 112
  • 1
    In Python 2.5, the "new" module is deprecated. 'new.function' should be replaced by 'types.FunctionType', after an "import types", I believe. – Eric O. Lebigot Aug 10 '09 at 10:31
  • @EOL: Good point - I've updated the code to use the types module instead. – Brian Aug 10 '09 at 11:26
  • 3
    Thanks. This is exactly what I was looking for. Based on some cursory testing, it works as is for generators. – Michael Fairley Aug 10 '09 at 17:47
  • 2
    If you read the first couple of paragraphs on the marshal module you see it strongly suggests using pickle instead? Same for the pickle page. http://docs.python.org/2/library/marshal.html – dgorissen Feb 25 '13 at 09:48
  • 1
    I am trying to apply the `marshal` module to serialize a dictionary of dictionaries initialized as `defaultdict(lambda : defaultdict(int))`. But it returns the error `ValueError: unmarshallable object`. Note I'am usin python2.7. Any idea? Thanks – user17375 May 08 '13 at 04:56
  • 1
    From the docs: "Python has a more primitive serialization module called marshal, but in general pickle should always be the preferred way to serialize Python objects. marshal exists primarily to support Python’s .pyc files." https://docs.python.org/3/library/pickle.html#relationship-to-other-python-modules – mgoldwasser Jun 21 '16 at 21:30
  • 1
    @mgoldwasser: Yes, but pickle does not support pickling code objects, which is what the OP is asking for. – Brian Aug 10 '16 at 10:56
  • 4
    On Python 3.5.3, `foo.func_code` raises `AttributeError`. Is there another way to get the function code? – AlQuemist Nov 18 '18 at 19:42
  • 1
    @AlQuemist `marshal.dumps(foo.__code__)` work for me – Bllli Apr 02 '19 at 09:03
  • 1
    Another question has an answer that goes further this same approach, see here: https://stackoverflow.com/a/6234683/47551 – xgMz Apr 30 '19 at 21:19
  • @AlQuemist: Try the following: ```import inspect code = inspect.getsource(func)``` – Aymen Alsaadi Dec 03 '20 at 16:47
  • 1
    @xgMz was just going to complain that the unmarshaled function forgot my default arguments, but looks like that other approach you linked contains that and much more. – ijoseph Jan 20 '22 at 04:49
59

Check out Dill, which extends Python's pickle library to support a greater variety of types, including functions:

>>> import dill as pickle
>>> def f(x): return x + 1
...
>>> g = pickle.dumps(f)
>>> f(1)
2
>>> pickle.loads(g)(1)
2

It also supports references to objects in the function's closure:

>>> def plusTwo(x): return f(f(x))
...
>>> pickle.loads(pickle.dumps(plusTwo))(1)
3
Josh Rosen
  • 13,511
  • 6
  • 58
  • 70
13

The most simple way is probably inspect.getsource(object) (see the inspect module) which returns a String with the source code for a function or a method.

Aaron Digulla
  • 321,842
  • 108
  • 597
  • 820
  • This looks good, except that the function name is explicitly defined in the code, which is slightly problematic. I could strip the first line of the code off, but that's breakable by doing something like 'def \/n func():'. I could pickle the name of the function with the function itself, but I'd have no guarantees that the name wouldn't collide, or I'd have to put the function in a wrapper, which is still not the cleanest solution, but it might have to do. – Michael Fairley Aug 10 '09 at 08:01
  • 1
    Note that the inspect module is actually just asking the function where it was defined, and then reading in those lines from the source code file - hardly sophisticated. – too much php Aug 10 '09 at 09:05
  • 1
    You can find out the function's name using its .__name__ attribute. You could do a regex replace on ^def\s*{name}\s*( and give it whatever name you like. It's not foolproof, but it will work for most things. – too much php Aug 10 '09 at 09:09
13

Pyro is able to do this for you.

voithos
  • 68,482
  • 12
  • 101
  • 116
RichieHindle
  • 272,464
  • 47
  • 358
  • 399
  • I'd need to stick with the standard library for this particular project. – Michael Fairley Aug 10 '09 at 07:48
  • 24
    But that doesn't mean you can't *look* at the code of Pyro to see how it is done :) – Aaron Digulla Aug 10 '09 at 09:45
  • 6
    @AaronDigulla- true, but it's worth mentioning that before reading a single line of someone else's published code, you should always check the software's license. Reading someone else's code and reusing the ideas without citing the source or adhering to license/copying constraints could be considered plagiarism and/or copyright violation in many cases. – mdscruggs Aug 14 '13 at 14:07
9

It all depends on whether you generate the function at runtime or not:

If you do - inspect.getsource(object) won't work for dynamically generated functions as it gets object's source from .py file, so only functions defined before execution can be retrieved as source.

And if your functions are placed in files anyway, why not give receiver access to them and only pass around module and function names.

The only solution for dynamically created functions that I can think of is to construct function as a string before transmission, transmit source, and then eval() it on the receiver side.

Edit: the marshal solution looks also pretty smart, didn't know you can serialize something other thatn built-ins

kurczak
  • 1,521
  • 1
  • 10
  • 18
6

In modern Python you can pickle functions, and many variants. Consider this

import pickle, time
def foobar(a,b):
    print("%r %r"%(a,b))

you can pickle it

p = pickle.dumps(foobar)
q = pickle.loads(p)
q(2,3)

you can pickle closures

import functools
foobar_closed = functools.partial(foobar,'locked')
p = pickle.dumps(foobar_closed)
q = pickle.loads(p)
q(2)

even if the closure uses a local variable

def closer():
    z = time.time()
    return functools.partial(foobar,z)
p = pickle.dumps(closer())
q = pickle.loads(p)
q(2)

but if you close it using an internal function, it will fail

def builder():
    z = 'internal'
    def mypartial(b):
        return foobar(z,b)
    return mypartial
p = pickle.dumps(builder())
q = pickle.loads(p)
q(2)

with error

pickle.PicklingError: Can't pickle <function mypartial at 0x7f3b6c885a50>: it's not found as __ main __.mypartial

Tested with Python 2.7 and 3.6

am70
  • 591
  • 6
  • 8
  • 3
    Just note, pickling does not actually serializer all the code. This is just serialising a reference to the function. Which means you can only run it if the function exists in the future. Think of the case where you want to replay code at a point in time (not that you would), but this would not allow it - you couldn't store the pickled function and call it in a moved on code base if that function no longer exists or doesn't exist in the same form. – Trent Apr 04 '22 at 23:54
5

The cloud package (pip install cloud) can pickle arbitrary code, including dependencies. See https://stackoverflow.com/a/16891169/1264797.

Community
  • 1
  • 1
stevegt
  • 1,644
  • 20
  • 26
3
code_string = '''
def foo(x):
    return x * 2
def bar(x):
    return x ** 2
'''

obj = pickle.dumps(code_string)

Now

exec(pickle.loads(obj))

foo(1)
> 2
bar(3)
> 9
Yanni Papadakis
  • 135
  • 1
  • 3
3

Cloudpickle is probably what you are looking for. Cloudpickle is described as follows:

cloudpickle is especially useful for cluster computing where Python code is shipped over the network to execute on remote hosts, possibly close to the data.

Usage example:

def add_one(n):
  return n + 1

pickled_function = cloudpickle.dumps(add_one)
pickle.loads(pickled_function)(42)
Simon C
  • 316
  • 2
  • 5
2

You can do this:

def fn_generator():
    def fn(x, y):
        return x + y
    return fn

Now, transmit(fn_generator()) will send the actual definiton of fn(x,y) instead of a reference to the module name.

You can use the same trick to send classes across network.

Fardin Abdi
  • 1,284
  • 15
  • 20
1

The basic functions used for this module covers your query, plus you get the best compression over the wire; see the instructive source code:

y_serial.py module :: warehouse Python objects with SQLite

"Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data."

http://yserial.sourceforge.net

0

Here is a helper class you can use to wrap functions in order to make them picklable. Caveats already mentioned for marshal will apply but an effort is made to use pickle whenever possible. No effort is made to preserve globals or closures across serialization.

    class PicklableFunction:
        def __init__(self, fun):
            self._fun = fun

        def __call__(self, *args, **kwargs):
            return self._fun(*args, **kwargs)

        def __getstate__(self):
            try:
                return pickle.dumps(self._fun)
            except Exception:
                return marshal.dumps((self._fun.__code__, self._fun.__name__))

        def __setstate__(self, state):
            try:
                self._fun = pickle.loads(state)
            except Exception:
                code, name = marshal.loads(state)
                self._fun = types.FunctionType(code, {}, name)
memeplex
  • 2,297
  • 27
  • 26