11

So, I have an object that has quite a bit of non-pickleable things in it (pygame events, orderedDicts, clock, etc.) and I need to save it to disk.

Thing is, if I can just get this thing to store a string that has the progress (a single integer is all I need), then I can pass it to the object's init and it will rebuild all of those things. Unfortunately, a framework I am using (Renpy) will pickle the object and attempt to load it, despite the fact that I could save it as a single integer, and I can't change that.

So, what I'm asking is, how can I override methods so that whenever pickle tries to save the object, it saves only the progress value, and whenever it tries to load the object, it creates a new instance from the progress value?

I've seen a bit of talk about the __repr__ method, but I am unsure how I would use it in my situation.

Matthew Fournier

2 Answers

13

The hook you're looking for is __reduce__. It should return a (callable, args) tuple; the callable and args will be serialized, and on deserialization, the object will be recreated through callable(*args). If your class's constructor takes an int, you can implement __reduce__ as

class ComplicatedThing:
    def __reduce__(self):
        return (ComplicatedThing, (self.progress_int,))

There are a few optional extra things you can put into the tuple, mostly useful for when your object graph has cyclic dependencies, but you shouldn't need them here.
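To make the round trip concrete, here is a minimal sketch of the approach above. The constructor signature and the `callback` attribute are assumptions made up for illustration; the lambda simply stands in for the unpicklable pieces (pygame events, clocks, and so on) that the constructor rebuilds.

```python
import pickle


class ComplicatedThing:
    def __init__(self, progress_int=0):
        self.progress_int = progress_int
        # Hypothetical stand-in for unpicklable state rebuilt from the integer.
        # A lambda stored on an instance cannot normally be pickled.
        self.callback = lambda: progress_int

    def __reduce__(self):
        # Only the class and this one-element args tuple end up in the pickle.
        return (ComplicatedThing, (self.progress_int,))


original = ComplicatedThing(7)
restored = pickle.loads(pickle.dumps(original))

print(restored.progress_int)  # 7
print(restored.callback())    # 7, rebuilt by __init__ rather than unpickled
```

Because pickle calls ComplicatedThing(7) on load, the constructor runs again and regenerates everything else, which is what the question asks for.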

Boris Verkhovskiy
user2357112
  • So, it looks like the system is using pickle.loads instead of load. Will this still work? I'm getting an error: TypeError: __init__() takes exactly 4 arguments (1 given). My main object doesn't need any arguments, but some of its sub-objects do. Can I make pickle skip them? They'll be re-generated when the main object is initialized anyway. – Matthew Fournier Jun 07 '15 at 23:33
  • @MatthewFournier: `loads` or `load` shouldn't matter. Your object can be reconstructed from a single int; do you have a function to do that? If you do, that function should be the `callable` in the tuple. If you don't have such a function, you'll probably need to write one. – user2357112 Jun 07 '15 at 23:42
  • Right now, the top-level object's init function takes a single argument and rebuilds it. I'm thinking the error is referring to some of its nested classes (since the error is stating that it needs four arguments). Is there a way I can make these objects ignored by the pickler? I don't need them saved at all. Can I have them reduce to None or something? – Matthew Fournier Jun 07 '15 at 23:47
  • @MatthewFournier: That's strange. I don't see how you'd be getting that error. This should already not be saving the objects you don't need saved. (My example code had an error where I forgot to wrap the argument in a tuple, but that would have produced a different error.) – user2357112 Jun 07 '15 at 23:55
  • Can pickle save and reload objects that have many arguments? Or do I need to rework everything to take a single argument list? – Matthew Fournier Jun 07 '15 at 23:57
  • @MatthewFournier: If your class takes 3 arguments `a`, `b`, and `c`, then returning `(ComplicatedObject, (a, b, c))` will cause your object to unpickle as `ComplicatedObject(a, b, c)`; you can pass any number of arguments, although there isn't support for keyword arguments. – user2357112 Jun 08 '15 at 00:00
  • Well, it looks like my problem lies elsewhere then. I'll mark this one as answered and make a new question, since it looks like a different issue entirely. – Matthew Fournier Jun 08 '15 at 00:12
4

Using __reduce__ is a valid way to do this, but as the Python docs state:

Although powerful, implementing __reduce__() directly in your classes is error prone. For this reason, class designers should use the high-level interface (i.e., __getnewargs_ex__(), __getstate__() and __setstate__()) whenever possible

So, I'll explain how to use the simpler higher-level interfaces __getstate__ and __setstate__ to make an object picklable.

Let's take a very simple class with an unpicklable attribute, say a file handle.

class Foo:
    def __init__(self, filename):
        self.filename = filename
        self.f = open(filename) # this attribute cannot be pickled

Instances of Foo are not picklable:

import pickle

obj = Foo('test.txt')
pickle.dumps(obj)
# TypeError: cannot pickle '_io.TextIOWrapper' object

We can make this class serializable and deserializable using pickle by implementing __getstate__ and __setstate__, respectively.

class Foo:
    ... # the class as it was
    def __getstate__(self):
        """Used for serializing instances"""

        # start with a copy so we don't accidentally modify the object state
        # or cause other conflicts
        state = self.__dict__.copy()

        # remove unpicklable entries
        del state['f']
        return state

    def __setstate__(self, state):
        """Used for deserializing"""
        # restore the state which was picklable
        self.__dict__.update(state)
        
        # restore unpicklable entries
        f = open(self.filename)
        self.f = f

Now it can be pickled:

obj = Foo('test.txt')
pickle.dumps(obj)
# b'\x80\x04\x951\x00\x00\x00\x00\x00\x00\x00\x8c\x08[...]'
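To check that loading works too, here is a self-contained sketch of the full round trip. It repeats the Foo class from above and, as an assumption for the sake of a runnable example, writes a small file into a temporary directory instead of relying on an existing test.txt.

```python
import os
import pickle
import tempfile


class Foo:
    def __init__(self, filename):
        self.filename = filename
        self.f = open(filename)  # this attribute cannot be pickled

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['f']  # drop the file handle before pickling
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.f = open(self.filename)  # reopen the file on load


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test.txt')
    with open(path, 'w') as fh:
        fh.write('hello')

    obj = Foo(path)
    clone = pickle.loads(pickle.dumps(obj))
    content = clone.f.read()
    print(content)  # hello

    # Close both handles before the temporary directory is removed.
    obj.f.close()
    clone.f.close()
```

Note that __init__ is not called when the object is loaded; pickle creates a bare instance and hands the saved dictionary to __setstate__, which is why the file has to be reopened there.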

Applying this idea to the example in your question, you might do something like this:

class MyComplicatedObject:
    def __getstate__(self):
        state = self.__dict__.copy()
        del state['progress'] # remove the unpicklable progress attribute
        return state
    def __setstate__(self, state):
        self.__dict__.update(state)
        # restore the progress from the progress integer
        self.progress = make_progress(self.progress_int)
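If everything except the integer can be regenerated, a variant of this is to have __getstate__ return only that integer. The sketch below is one possible shape under that assumption; the `_build` helper and the `helpers` attribute are hypothetical names standing in for whatever your constructor sets up.

```python
import pickle


class MyComplicatedObject:
    def __init__(self, progress_int=0):
        self.progress_int = progress_int
        self._build()

    def _build(self):
        # Hypothetical stand-in for the unpicklable parts (events, clocks, ...).
        self.helpers = [lambda: self.progress_int]

    def __getstate__(self):
        # Save only the integer, wrapped in a dict so the state is never falsy.
        return {'progress_int': self.progress_int}

    def __setstate__(self, state):
        # __init__ is not called on load, so rebuild everything here.
        self.progress_int = state['progress_int']
        self._build()


restored = pickle.loads(pickle.dumps(MyComplicatedObject(5)))
print(restored.progress_int)  # 5
print(restored.helpers[0]())  # 5
```

Sharing the rebuild logic between __init__ and __setstate__ keeps the two construction paths from drifting apart.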

Another way to do this would be to configure the pickler to know how to pickle new objects (rather than making the classes/objects themselves picklable). For example, with a custom pickler and dispatch_table you can register classes to functions (__reduce__-like) in order to pickle objects that may otherwise not be picklable.
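As a rough sketch of the dispatch_table route, assuming a class you cannot edit (ThirdPartyThing and reduce_thing are made-up names for illustration), the reducer is registered on one Pickler instance only:

```python
import copyreg
import io
import pickle


class ThirdPartyThing:
    """Hypothetical stand-in for a class you cannot modify."""

    def __init__(self, progress_int):
        self.progress_int = progress_int
        self.handler = lambda: progress_int  # unpicklable attribute


def reduce_thing(obj):
    # Same contract as __reduce__: return (callable, args).
    return (ThirdPartyThing, (obj.progress_int,))


buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)
# Start from the global copyreg table, then add a reducer for this pickler only.
pickler.dispatch_table = copyreg.dispatch_table.copy()
pickler.dispatch_table[ThirdPartyThing] = reduce_thing
pickler.dump(ThirdPartyThing(3))

restored = pickle.loads(buffer.getvalue())
print(restored.progress_int)  # 3
```

Loading needs no special unpickler, because the pickle only records the callable and its arguments.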

In Python 3.8+ you can also implement custom reductions for objects by overriding the reducer_override() method in a Pickler subclass.
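One form of custom reduction available in Python 3.8+ is the reducer_override() method of a Pickler subclass. The following is a minimal sketch; ThirdPartyThing and ProgressPickler are hypothetical names, and the lambda again stands in for state that cannot be pickled.

```python
import io
import pickle


class ThirdPartyThing:
    """Hypothetical stand-in for a class you cannot modify."""

    def __init__(self, progress_int):
        self.progress_int = progress_int
        self.handler = lambda: progress_int  # unpicklable attribute


class ProgressPickler(pickle.Pickler):
    def reducer_override(self, obj):
        if isinstance(obj, ThirdPartyThing):
            # Same contract as __reduce__: return (callable, args).
            return (ThirdPartyThing, (obj.progress_int,))
        # Fall back to the normal pickling behaviour for everything else.
        return NotImplemented


buffer = io.BytesIO()
ProgressPickler(buffer).dump(ThirdPartyThing(9))

restored = pickle.loads(buffer.getvalue())
print(restored.progress_int)  # 9
```

Returning NotImplemented for other objects matters, since reducer_override() is consulted for every object the pickler encounters.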

These methods are particularly useful if you are trying to pickle classes that may belong to third party libraries/code where subclassing (to make the object picklable) is not practical.

sytech