I ran into a really strange problem while trying to clone a function in Python, using the technique from "How to create a copy of a python function".

A minimal example that shows the issue:

import dill
import pickle
import types

def foo():
    print ('a')

fooCopy=types.FunctionType(foo.__code__, foo.__globals__, 'IAmFooCopied',foo.__defaults__ , foo.__closure__)

print ( 'printing foo and the copy', fooCopy, foo )
print ( 'dill output: ', dill.dumps(fooCopy ))
print ( 'pickle Output: ', pickle.dumps (fooCopy) )

Output:

printing foo and the copy <function foo at 0x7fb6ec6349d8> <function foo at 0x7fb6ed41a268>
dill output:  b'\x80\x03cdill._dill\n_create_function\nq\x00(cdill._dill\n_load_type\nq\x01X\x08\x00\x00\x00CodeTypeq\x02\x85q\x03Rq\x04(K\x00K\x00K\x00K\x02KCC\x0ct\x00d\x01\x83\x01\x01\x00d\x00S\x00q\x05NX\x01\x00\x00\x00aq\x06\x86q\x07X\x05\x00\x00\x00printq\x08\x85q\t)X\x10\x00\x00\x00testCloneFunc.pyq\nX\x03\x00\x00\x00fooq\x0bK\x05C\x02\x00\x01q\x0c))tq\rRq\x0ec__builtin__\n__main__\nX\x0c\x00\x00\x00IAmFooCopiedq\x0fNN}q\x10tq\x11Rq\x12.'
Traceback (most recent call last):
  File "testCloneFunc.py", line 12, in <module>
    print ( 'pickle Output: ', pickle.dumps (fooCopy) )
_pickle.PicklingError: Can't pickle <function foo at 0x7fb6ec6349d8>: it's not the same object as __main__.foo

The first thing I found strange is that if you print the copy, you get the same name as the original, whereas I expected it to be 'IAmFooCopied'.

As for the error, I guess pickle is also tricked into thinking the two objects are the same.
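
A quick way to check which name the copy actually carries is to look at its name-related attributes directly. This is only a minimal sketch, assuming CPython 3; the `names` dictionary is illustrative, and `foo` returns a value instead of printing so the result is easy to inspect.

```python
import types


def foo():
    return "a"


# Same construction as in the question: only the `name` argument differs.
foo_copy = types.FunctionType(
    foo.__code__, foo.__globals__, "IAmFooCopied",
    foo.__defaults__, foo.__closure__,
)

# Collect the three places where a function's name can live.
names = {
    "__name__": foo_copy.__name__,          # follows the `name` argument
    "__qualname__": foo_copy.__qualname__,  # still derived from the code object
    "co_name": foo_copy.__code__.co_name,   # the shared code object is unchanged
}
print(names)
```

On the interpreters I am aware of, only `__name__` picks up the new name, which would explain why both `repr` and pickle, which rely on `__qualname__`, still see `foo`.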

Some docs about this pickle error: https://code.google.com/archive/p/modwsgi/wikis/IssuesWithPickleModule.wiki

But I really don't understand why pickle cannot see that these two functions are not the same. Is there any quick fix I can use?

Edit: It seems that the name argument of FunctionType sets neither co_name on the code object nor __qualname__ on the function. By recreating the code object and setting __qualname__ explicitly, I fixed the old error, only to run into this one:

import dill
import pickle
import types

def foo():
    print ('a')


oldCode=foo.__code__

name='IAmFooCopied'

newCode= types.CodeType(
        oldCode.co_argcount,             #   integer
        oldCode.co_kwonlyargcount,       #   integer
        oldCode.co_nlocals,              #   integer
        oldCode.co_stacksize,            #   integer
        oldCode.co_flags,                #   integer
        oldCode.co_code,                 #   bytes
        oldCode.co_consts,               #   tuple
        oldCode.co_names,                #   tuple
        oldCode.co_varnames,             #   tuple
        oldCode.co_filename,             #   string
        name,                  #   string
        oldCode.co_firstlineno,          #   integer
        oldCode.co_lnotab,               #   bytes
        oldCode.co_freevars,             #   tuple
        oldCode.co_cellvars              #   tuple
        )


fooCopy=types.FunctionType(newCode, foo.__globals__, name,foo.__defaults__ , foo.__closure__)

fooCopy.__qualname__= name

print ( 'printing foo and the copy', fooCopy, foo )
print ( 'dill output: ', dill.dumps(fooCopy ))
print ( 'pickle Output: ', pickle.dumps (fooCopy) )

New output:

printing foo and the copy <function IAmFooCopied at 0x7fee8ebb19d8> <function foo at 0x7fee8f996268>
dill output:  b'\x80\x03cdill._dill\n_create_function\nq\x00(cdill._dill\n_load_type\nq\x01X\x08\x00\x00\x00CodeTypeq\x02\x85q\x03Rq\x04(K\x00K\x00K\x00K\x02KCC\x0ct\x00d\x01\x83\x01\x01\x00d\x00S\x00q\x05NX\x01\x00\x00\x00aq\x06\x86q\x07X\x05\x00\x00\x00printq\x08\x85q\t)X\x10\x00\x00\x00testCloneFunc.pyq\nX\x0c\x00\x00\x00IAmFooCopiedq\x0bK\x05C\x02\x00\x01q\x0c))tq\rRq\x0ec__builtin__\n__main__\nh\x0bNN}q\x0ftq\x10Rq\x11.'
Traceback (most recent call last):
  File "testCloneFunc.py", line 38, in <module>
    print ( 'pickle Output: ', pickle.dumps (fooCopy) )
_pickle.PicklingError: Can't pickle <function IAmFooCopied at 0x7fee8ebb19d8>: attribute lookup IAmFooCopied on __main__ failed

Also, dill.detect fails to detect any problem.

ninjaconcombre

2 Answers

I'm not sure what you want to do here... but just to be clear -- dill works as expected.

>>> import dill                                              
>>> import pickle
>>> import types
>>> 
>>> def foo():
...     print ('a')
... 
>>> fooCopy=types.FunctionType(foo.__code__, foo.__globals__, 'IAmFooCopied',foo.__defaults__ , foo.__closure__)
>>>
>>> dill.loads(dill.dumps(foo))
<function foo at 0x1058172a8>
>>> dill.loads(dill.dumps(fooCopy))
<function IAmFooCopied at 0x105817320>
>>> 

pickle fails because it serializes functions by reference (i.e. as a reference to the module the function was built in plus the function's name), and that often breaks for user-built functions. You can see this in the dumped string: pickle basically stores a prefix indicating which serialization protocol to use, then the name of the module (__main__), then the name of the function ('IAmFooCopied'). For this to work, looking that name up in the module has to return the very same object. dill, on the other hand, serializes the function itself, which is exactly what you are doing by hand. See save_function and _create_function here: https://github.com/uqfoundation/dill/blob/master/dill/_dill.py.
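
To make the by-reference behaviour concrete, here is a small sketch using a standard-library function (`json.dumps` is an arbitrary choice, not something from the question). The payload should contain little more than the module name and the function name, and loading it should hand back the existing function object rather than a copy.

```python
import json
import pickle

# Pickle a module-level function: only a reference is written, not the bytecode.
payload = pickle.dumps(json.dumps)
print(payload)

# Unpickling imports the module and looks the name up again.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

This is why a hand-built function needs to be reachable under its own name in its module before pickle will accept it.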

Mike McKerns

Here is the code that works:

import pickle
import dill
import types

def foo():
    print ('a')


oldCode=foo.__code__

name='IAmFooCopied'

newCode= types.CodeType(
        oldCode.co_argcount,             #   integer
        oldCode.co_kwonlyargcount,       #   integer
        oldCode.co_nlocals,              #   integer
        oldCode.co_stacksize,            #   integer
        oldCode.co_flags,                #   integer
        oldCode.co_code,                 #   bytes
        oldCode.co_consts,               #   tuple
        oldCode.co_names,                #   tuple
        oldCode.co_varnames,             #   tuple
        oldCode.co_filename,             #   string
        name,                  #   string
        oldCode.co_firstlineno,          #   integer
        oldCode.co_lnotab,               #   bytes
        oldCode.co_freevars,             #   tuple
        oldCode.co_cellvars              #   tuple
        )

IAmFooCopied=types.FunctionType(newCode, foo.__globals__, name,foo.__defaults__ , foo.__closure__)
IAmFooCopied.__qualname__= name
print ( 'printing foo and the copy', IAmFooCopied, foo )
print ( 'dill output: ', dill.dumps(IAmFooCopied ))
print ( 'pickle Output: ', pickle.dumps (IAmFooCopied) )

If the variable the function is bound to (the IAmFooCopied in IAmFooCopied=types.FunctionType(newCode, foo.__globals__, name,foo.__defaults__ , foo.__closure__)) and the string in name='IAmFooCopied' do not match, pickle can't find the function to serialize.

Output:

printing foo and the copy <function IAmFooCopied at 0x7f8a6a8159d8> <function foo at 0x7f8a6b5f5268>
dill output:  b'\x80\x03cdill._dill\n_create_function\nq\x00(cdill._dill\n_load_type\nq\x01X\x08\x00\x00\x00CodeTypeq\x02\x85q\x03Rq\x04(K\x00K\x00K\x00K\x02KCC\x0ct\x00d\x01\x83\x01\x01\x00d\x00S\x00q\x05NX\x01\x00\x00\x00aq\x06\x86q\x07X\x05\x00\x00\x00printq\x08\x85q\t)X\x10\x00\x00\x00testCloneFunc.pyq\nX\x0c\x00\x00\x00IAmFooCopiedq\x0bK\x05C\x02\x00\x01q\x0c))tq\rRq\x0ec__builtin__\n__main__\nh\x0bNN}q\x0ftq\x10Rq\x11.'
pickle Output:  b'\x80\x03c__main__\nIAmFooCopied\nq\x00.'
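
The name-matching requirement described above can also be sketched in a form that does not depend on the positional `types.CodeType` constructor, whose argument list differs between Python versions. This sketch assumes Python 3.8 or later, where code objects have a `replace()` method. The helper `clone_function` and the scratch module `clone_demo` are hypothetical names introduced only for illustration; in a script you would bind the clone in `__main__` as in the code above.

```python
import pickle
import sys
import types


def clone_function(func, new_name):
    """Copy func so its code object, __name__ and __qualname__ all use new_name."""
    # code.replace() (Python 3.8+) copies the code object with one field changed.
    new_code = func.__code__.replace(co_name=new_name)
    clone = types.FunctionType(
        new_code, func.__globals__, new_name,
        func.__defaults__, func.__closure__,
    )
    clone.__qualname__ = new_name  # pickle and repr both look at __qualname__
    return clone


def foo():
    return "a"


# Hypothetical scratch module, so the sketch behaves the same however it is run.
scratch = types.ModuleType("clone_demo")
sys.modules["clone_demo"] = scratch

clone = clone_function(foo, "IAmFooCopied")
clone.__module__ = "clone_demo"

# The name is not bound in the module yet, so pickle's lookup should fail.
try:
    pickle.dumps(clone)
    lookup_failed = False
except (pickle.PicklingError, AttributeError):
    lookup_failed = True

# Bind the clone under exactly the name it reports, then pickle again.
scratch.IAmFooCopied = clone
payload = pickle.dumps(clone)
restored = pickle.loads(payload)
print(lookup_failed, restored is clone)
```

Note that the round trip only works in a process where the same name is bound in the same module; if the clone has to travel to a process that never created it, dill's by-value approach is the better fit.
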
ninjaconcombre