1

I am using the pickle library to serialise a custom object, let's call it A, which is defined in a.py.

If I pickle an object of type A in a.py as follows:

import pickle

class A:
    ...

if __name__ == "__main__":
    inst = A("some param")
    with open("a.pickle", 'wb') as dumpfile:
        pickle.dump(inst, dumpfile)

Then there is a problem with loading this object from storage if the module A is not explicitly in the namespace __main__. See this other question. This is because pickle knows that it should look for the class A in __main__, since that's where it was when pickle.dump() happened.

Now, there are two approaches to dealing with this:

  1. Deal with it at the deserialisation end,
  2. Deal with it at the serialisation end.

For 1, there are various options (see the above link, for example), but I want to avoid these, because I think it makes sense to give the 'pickler' responsibility regarding its data.

For 2, we could just avoid pickling when the module is under the __main__ namespace, but that doesn't seem very flexible. We could alternatively modify A.__module__, and set it to the name of the module (as done here). Pickle uses this __module__ variable to find where to import the class A from, so setting it before .dump() works:

if __name__ == "__main__":
    inst = A("some param")
    A.__module__ = 'a'
    with open("a.pickle", 'wb') as dumpfile:
        pickle.dump(inst, dumpfile)

Q: is this a good idea? It seems like it's implementation dependent, not interface dependent. That is, pickle could decide to use another method of locating modules to import, and this approach would break. Is there an alternative that uses pickle's interface?

Useful_Investigator
  • 938
  • 2
  • 8
  • 27
  • Does this answer your question? [Store object using Python pickle, and load it into different namespace](https://stackoverflow.com/questions/26696695/store-object-using-python-pickle-and-load-it-into-different-namespace) – Jacques Gaudin Feb 15 '22 at 15:09
  • @JacquesGaudin Thanks, `dill` provided in that question is probably a good choice. I missed that question somehow, both early on and on reviewing this q. Happy to close as a duplicate pointing there, but perhaps there is some insight to be had about modifying `__module__`? – Useful_Investigator Feb 15 '22 at 15:18

1 Answers1

1

Another way around it would be to import the file itself:

import pickle
import a

class A:
    pass

def main():
    inst = a.A()
    print(inst.__module__)
    with open("a.pickle", 'wb') as dumpfile:
        pickle.dump(inst, dumpfile)

if __name__ == "__main__":
    main()

Note that it works because the import statement is purely an assignment of the name a to a module object, it doesn't go to infinite recursion when you import a within a.py.

Jacques Gaudin
  • 15,779
  • 10
  • 54
  • 75