
My module contains a class that should be picklable, both its instances and its definition. I have the following structure:

MyModule
|-Submodule
  |-MyClass

In other questions on SO I found that dill can pickle class definitions. Sure enough, it works when I copy the definition of MyClass into a separate script and pickle it there, like this:

import dill as pickle

class MyClass(object):
    ...

instance = MyClass(...)
with open(..., 'wb') as file:
    pickle.dump(instance, file)

However, it does not work when importing the class:

Pickling:

from MyModule.Submodule import MyClass
import dill as pickle

instance = MyClass(...)
with open(..., 'wb') as file:
    pickle.dump(instance, file)

Loading:

import dill as pickle

with open(..., 'rb') as file:
    instance = pickle.load(file)

>>> ModuleNotFoundError: No module named 'MyModule'

I think the class definition is being saved by reference, even though dill's default settings suggest it should not be. The definition is saved correctly when the class is known as __main__.MyClass, which happens when it is defined in the main script.
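To make "saved by reference" concrete, here is a minimal sketch using only the standard library pickle rather than dill. The module name fake_submodule is a hypothetical in-memory stand-in for MyModule.Submodule. The pickle stores only the module and class name, so loading fails once the module can no longer be imported. dill is assumed to behave the same way for classes that live in an imported module, which matches the error above.

```python
import pickle
import sys
import types

# Hypothetical stand-in for MyModule.Submodule, built in memory for the demo.
mod = types.ModuleType("fake_submodule")
exec(
    "class MyClass:\n"
    "    def __init__(self, x):\n"
    "        self.x = x\n",
    mod.__dict__,
)
sys.modules["fake_submodule"] = mod

instance = mod.MyClass(1)
data = pickle.dumps(instance)

# The class is recorded by name: only 'fake_submodule' and 'MyClass' are stored.
print(mod.MyClass.__module__)
print(b"fake_submodule" in data)

# Simulate the other machine, where the module cannot be imported.
del sys.modules["fake_submodule"]
try:
    pickle.loads(data)
    error = None
except ModuleNotFoundError as exc:
    error = exc
print(repr(error))
```

The class body never ends up in the byte stream, which is why the loading side needs the original module.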

I am wondering, is there any way to detach MyClass from MyModule? Any way to make it act like a top level import (__main__.MyClass) so dill knows how to load it on my other machine?

Relevant question: Why dill dumps external classes by reference no matter what

vriesdemichael

3 Answers


Dill indeed stores the definitions only of objects defined in __main__; objects defined in other modules are stored by reference. One way around this is to redefine those objects in __main__:

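def mainify(obj):
    """Redefine obj (a class or function) in __main__ so dill pickles it by value."""
    import __main__
    import inspect
    import ast

    # Re-execute the object's source inside __main__, so its __module__ becomes '__main__'.
    s = inspect.getsource(obj)
    m = ast.parse(s)
    co = compile(m, '<string>', 'exec')
    exec(co, __main__.__dict__)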

And then:

from MyModule.Submodule import MyClass
import dill as pickle

mainify(MyClass)
instance = MyClass(...)
with open(..., 'wb') as file:
    pickle.dump(instance, file)

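Now you should be able to load the pickle from anywhere, even where MyModule.Submodule is not available. Note that this relies on the script above being run as the main script, so that mainify rebinds the name MyClass before the instance is created.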
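A quick way to check that the redefinition took effect is to look at the class's __module__. The sketch below is only an illustration: demo_submodule is a hypothetical module written to a temporary directory so that inspect.getsource has a file to read, and it repeats the core step of mainify inline.

```python
import importlib
import inspect
import sys
import tempfile
from pathlib import Path

import __main__

# Hypothetical module on disk, standing in for MyModule.Submodule.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_submodule.py").write_text(
    "class MyClass:\n"
    "    def __init__(self, x):\n"
    "        self.x = x\n"
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
imported_cls = importlib.import_module("demo_submodule").MyClass

# Instance created BEFORE the redefinition: still tied to demo_submodule.
old_instance = imported_cls(1)

# Same idea as mainify(): re-execute the class source inside __main__.
exec(inspect.getsource(imported_cls), __main__.__dict__)
main_cls = __main__.MyClass

print(imported_cls.__module__)             # demo_submodule
print(main_cls.__module__)                 # __main__
print(main_cls is imported_cls)            # False
print(isinstance(old_instance, main_cls))  # False
```

Two things follow from this. Instances created before the redefinition still belong to the module's class, so they would still be pickled by reference. And only the class source is copied, so module-level imports or helpers that the class relies on have to be available in __main__ as well.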

oegedijk

I'm the dill author. This is a duplicate of the question you refer to above. The relevant GitHub feature request is: https://github.com/uqfoundation/dill/issues/128.

I think the larger issue is that you want to pickle an object defined in another file that is not installed. This is currently not possible, I believe.

As a workaround, I believe you may be able to pickle with dill.source by extracting the source code of the class (or module) and pickling that dynamically, or extracting the source code and compiling a new object in __main__.
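An untested-against-dill sketch of the first idea, shipping the source text together with the instance state. It uses only the standard library; CLASS_SOURCE stands in for whatever inspect.getsource(MyClass) would return, and dump_with_source and load_with_source are hypothetical helper names, not part of dill.

```python
import pickle

# Stand-in for inspect.getsource(MyClass); a hypothetical example class.
CLASS_SOURCE = '''
class MyClass:
    def __init__(self, value):
        self.value = value

    def double(self):
        return self.value * 2
'''


def dump_with_source(instance, class_source):
    """Bundle class source, class name and instance state into plain data."""
    payload = {
        "source": class_source,
        "class_name": type(instance).__name__,
        "state": dict(instance.__dict__),
    }
    return pickle.dumps(payload)


def load_with_source(data):
    """Recreate the class from its source, then rebuild the instance."""
    payload = pickle.loads(data)
    namespace = {"__name__": "restored"}
    exec(payload["source"], namespace)
    cls = namespace[payload["class_name"]]
    obj = cls.__new__(cls)  # skip __init__, restore attributes directly
    obj.__dict__.update(payload["state"])
    return obj


# Sender side: build the class from its source and create an instance.
sender_ns = {"__name__": "sender"}
exec(CLASS_SOURCE, sender_ns)
original = sender_ns["MyClass"](21)
blob = dump_with_source(original, CLASS_SOURCE)

# Receiver side: no module containing MyClass needs to be importable.
restored = load_with_source(blob)
print(restored.double())  # 42
```

This only works if the instance attributes are themselves picklable without the class, and if the class source does not depend on names from its original module. The loader runs exec on the payload, so it should only be used with trusted data, as with any pickle.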

Mike McKerns
  • I'll potentially update this answer with a test case, if I build one that demonstrates my hypothesis. – Mike McKerns Sep 19 '18 at 16:59
  • Since you've never updated your answer, I take it your hypothesis didn't pan out. – martineau Nov 13 '21 at 19:47
  • @martineau: I still think the hypothesis is valid... I think I just never went back and built a test case. It's been a few years... – Mike McKerns Nov 13 '21 at 22:14
  • The question is still relevant. See for example the recent [Is there a way to serialize a class such that it can be unserialized independent of its original script?](https://stackoverflow.com/questions/69859147/is-there-a-way-to-serialize-a-class-such-that-it-can-be-unserialized-independent) – martineau Nov 15 '21 at 16:42
  • Sure, I get asked about this particular feature fairly regularly. I only meant *the above post* is a bit stale, not that the issue itself is not relevant. Adding a file to `__main__` does work, as you have done. Also, if the file is imported, it works. Issues arise if the file is not in the PYTHONPATH, or if a non-installed module uses local imports. See https://stackoverflow.com/questions/31884640, and https://github.com/uqfoundation/dill/issues/123#issue-99914949. As I said above, I believe that `getsource` should also work in dire cases -- but I should verify that... – Mike McKerns Nov 16 '21 at 15:18

I managed to save the instance and definition of my class using the following dirty hack:

class MyClass(object):
    def save(self, path):
        import __main__

        # Re-run this file with __main__'s globals, so the guarded block below executes.
        with open(__file__) as f:
            code = compile(f.read(), "somefile.py", 'exec')
            main_globals = __main__.__dict__
            exec_locals = {'instance': self, 'savepath': path}
            exec(code, main_globals, exec_locals)

if __name__ == '__main__':
    # Script is loaded in top level, MyClass is now available under the qualname '__main__.MyClass'
    import dill as pickle

    # Copy the attributes of the 'MyModule.Submodule.MyClass' instance to a new 'MyClass' instance.
    new_instance = MyClass.__new__(MyClass)
    new_instance.__dict__ = locals()['instance'].__dict__

    with open(locals()['savepath'], 'wb') as f:
        pickle.dump(new_instance, f)

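By calling exec, the file is executed as if it were running in __main__, so the class definition is saved along with the instance. This file should not be run directly as the main script; the guarded block only works when it is reached through the save method, which supplies 'instance' and 'savepath'.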

vriesdemichael