
The documentation linked below seems to say that top-level classes can be pickled, as well as their instances. But based on the answers to my previous question, that seems not to be correct. In the script I posted, pickle accepts the class object and writes a file, but the file is not useful.

THIS IS MY QUESTION: Is this documentation wrong, or is there something more subtle I don't understand? Also, should pickle be generating some kind of error message in this case?

https://docs.python.org/2/library/pickle.html#what-can-be-pickled-and-unpickled

The following types can be pickled:

  • None, True, and False
  • integers, long integers, floating point numbers, complex numbers
  • normal and Unicode strings
  • tuples, lists, sets, and dictionaries containing only picklable objects
  • functions defined at the top level of a module
  • built-in functions defined at the top level of a module
  • **classes that are defined at the top level of a module** (my bold)
  • instances of such classes whose `__dict__` or the result of calling `__getstate__()` is picklable (see section The pickle protocol for details).
uhoh
  • It seems to me that the documentation clearly explains what it means to pickle a class, and your quarrel is with whether that's what it *ought* to mean to pickle a class. – nthall Dec 16 '15 at 14:10
  • No quarrel intended. I am trying to understand what exactly it means to "pickle a class." If it is clear to you, could you post something as an answer? I originally thought that I could recover the definition so I could create more instances, but that seems to be wrong. – uhoh Dec 16 '15 at 14:18

2 Answers


Make a class that is defined at the top level of a module:

foo.py:

class Foo(object): pass

Then running a separate script,

script.py:

import pickle
import foo


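# Open in binary mode so the script also runs on Python 3.
# Protocol 0 is the text protocol, which keeps the file human-readable.
with open('/tmp/out.pkl', 'wb') as f:
    pickle.dump(foo.Foo, f, 0)

# This removes only the local name; the module stays cached in sys.modules.
del foo

with open('/tmp/out.pkl', 'rb') as f:
    cls = pickle.load(f)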

print(cls)

prints

<class 'foo.Foo'>

Note that the pickle file, out.pkl, contains only strings naming the defining module and the class. It does not store the definition of the class:

cfoo
Foo
p0
.
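The same thing can be checked in memory without writing a file. This is only a minimal sketch: `fractions.Fraction` is an arbitrary stand-in for any importable top-level class (it is not part of the original example), and protocol 0 is requested explicitly because newer Python versions default to a binary protocol.

```python
import pickle
from fractions import Fraction  # stand-in class, chosen only for illustration

# Protocol 0 is the text protocol, the one that produced the file shown above.
data = pickle.dumps(Fraction, 0)
print(data)

# The payload holds the module name and the class name, not the class body.
has_module_name = b"fractions" in data
has_class_name = b"Fraction" in data

# Unpickling imports the module and looks the name up again,
# so the result is the very same class object.
restored = pickle.loads(data)
same_object = restored is Fraction
```

The pickled bytes are only a few dozen characters long, which is far too short to hold a class definition.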

Therefore, at the time of unpickling the defining module, foo, must contain the definition of the class. If you delete the class from the defining module

del foo.Foo

then you'll get the error

AttributeError: 'module' object has no attribute 'Foo'
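A self-contained way to reproduce this failure, as a sketch: instead of a real foo.py on disk, it builds a throwaway module in memory. The module name `foo_demo` is hypothetical and used only for this demonstration. On Python 3 the exception type is the same, although the message text differs from the Python 2 message shown above.

```python
import pickle
import sys
import types

# Build a throwaway module in memory ("foo_demo" is a hypothetical name).
mod = types.ModuleType("foo_demo")
exec("class Foo(object): pass", mod.__dict__)
sys.modules["foo_demo"] = mod  # make it importable by name

data = pickle.dumps(mod.Foo)

# While the class still lives in its module, unpickling returns it.
round_trip_ok = pickle.loads(data) is mod.Foo

# Remove the class from the defining module, then try again.
del mod.Foo
try:
    pickle.loads(data)
    error = None
except AttributeError as exc:
    error = exc  # the lookup of "Foo" on the module fails

del sys.modules["foo_demo"]  # clean up the demo module
```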
unutbu
  • Thank you! Indeed, that was my experience in my previous [question](http://stackoverflow.com/q/34261379/3904031) a few days ago. So I'm getting closer to understanding, can you help me understand what is the utility of pickling a class without its definition. Is it along the lines of just managing the name of the class? – uhoh Dec 16 '15 at 14:47
  • See comment by @justhalf below that [answer](http://stackoverflow.com/a/34262880/3904031) – uhoh Dec 16 '15 at 15:24
  • @uhoh: Pickling a class is essentially the same as storing the name of the class (as a string) and the name of the defining module (as a string). If you were to just store the class name and module name as strings, then you could use `module = __import__(module_name)`, `cls = getattr(module, class_name)` instead of pickling. I tend to only use pickling (implicitly) when using the multiprocessing module -- since all objects passed through `Queue`s get pickled. For long-term persistence, I tend to use other mechanisms like `JSON` or a database to avoid version or language incompatibility. – unutbu Dec 16 '15 at 19:10
  • Thank you for taking the time to explain even further. This is really helpful! – uhoh Dec 17 '15 at 06:10
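The string-based alternative described in the comments above might look like the sketch below. The helper names `class_to_json` and `class_from_json` are hypothetical, and `fractions.Fraction` is just a convenient importable class. `importlib.import_module` is swapped in for `__import__` because, for a dotted module name, `__import__` returns the top-level package rather than the submodule.

```python
import importlib
import json
from fractions import Fraction  # stand-in class, chosen only for illustration


def class_to_json(cls):
    """Store only the defining module's name and the class name."""
    return json.dumps({"module": cls.__module__, "name": cls.__name__})


def class_from_json(text):
    """Import the module by name and look the class up on it."""
    ref = json.loads(text)
    module = importlib.import_module(ref["module"])
    return getattr(module, ref["name"])


text = class_to_json(Fraction)
restored = class_from_json(text)
```

As with pickle, this only works if the defining module is importable and still contains the class when it is loaded.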

It's totally possible to pickle a class instance in Python while also saving the code to reconstruct the class and the instance's state. If you want to hack together a solution on top of pickle, or use a "trojan horse" exec-based method, here's how to do it:

How to unpickle an object whose class exists in a different namespace (python)?

Or, if you use dill, you have a dump function that already knows how to store a class instance, the class code, and the instance state:

How to recover a pickled class and its instances

Pickle python class instance plus definition

I'm the dill author, and I created dill in part to be able to ship class instances and class methods across multiprocessing.

Can't pickle <type 'instancemethod'> when using python's multiprocessing Pool.map()

Mike McKerns
  • Thanks @Mike. The question refers to the pickle module proper and its documentation. Now I see that the term "pickling" can be used more broadly, for other pickle-based and pickle-like procedures. Good to collect these links together here. – uhoh Dec 22 '15 at 15:29
  • Understood… Note that my first link shows how to do exactly what `dill` does, but only using the `pickle` module. If you register a serialization method for classes, then classes can serialize. The docs are with regard to `pickle`, without any user extensions using `copyreg`. – Mike McKerns Dec 22 '15 at 16:11