I've been trying to get some dynamically created types (i.e. ones created by calling 3-arg type()) to pickle and unpickle nicely. I've been using this module switching trick to hide the details from users of the module and give clean semantics.

I've learned several things already:

  1. The type must be findable with getattr on the module itself.
  2. The type must be consistent with what getattr finds; that is, if we call pickle.dumps(o), then type(o) == getattr(module, 'name of type') must hold.
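
A minimal sketch of those two conditions, separate from the setup below. It uses a throwaway module named demo_mod and a helper can_pickle(), both of which are hypothetical names made up for this illustration:

```python
import pickle
import sys
import types

# Hypothetical stand-in module, registered so pickle can import it by name.
demo = types.ModuleType('demo_mod')
sys.modules['demo_mod'] = demo

# Dynamically created type via 3-arg type(); __module__ tells pickle where to look.
Point = type('Point', (), {'__module__': 'demo_mod'})

def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if the lookup fails."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False

# Condition 1 violated: the class is not yet an attribute of demo_mod.
before = can_pickle(Point())

# Both conditions met: getattr(demo_mod, 'Point') is the instance's class.
demo.Point = Point
after = can_pickle(Point())

# Condition 2 violated: the name now resolves to a different class object.
demo.Point = type('Point', (), {'__module__': 'demo_mod'})
mismatch = can_pickle(Point())

print(before, after, mismatch)
```

Only the middle case should pickle; the other two fail during the lookup of the class on the module.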

Where I'm stuck though is that there still seems to be something odd going on - it seems to be calling __getstate__ on something unexpected.

Here's the simplest setup I've got that reproduces the issue, testing with Python 3.5, but I'd like to target back to 3.3 if possible:

# module.py
import sys
import functools

def dump(self):
    return b'Some data' # Dummy for testing

def undump(self, data):
    print('Undump: %r' % data) # Do nothing for testing

# Cheaty demo way to make this consistent
@functools.lru_cache(maxsize=None)
def make_type(name):
    return type(name, (), {
        '__getstate__': dump,
        '__setstate__': undump,
    })

class Magic(object):
    def __init__(self, path):
        self.path = path

    def __getattr__(self, name):
        print('Getting thing: %s (from: %s)' % (name, self.path))
        # for simple testing all calls to make_type must end in last x.y.z.last
        if name != 'last':
            if self.path:
                return Magic(self.path + '.' + name)
            else:
                return Magic(name)
        return make_type(self.path + '.' + name)

# Make the switch
sys.modules[__name__] = Magic('')

And then a quick way to exercise that:

import module
import pickle

f=module.foo.bar.woof.last()
print(f.__getstate__()) # See, *this* works
print('Pickle starts here')
print(pickle.dumps(f))

Which then gives:

Getting thing: foo (from: )
Getting thing: bar (from: foo)
Getting thing: woof (from: foo.bar)
Getting thing: last (from: foo.bar.woof)
b'Some data'
Pickle starts here
Getting thing: __spec__ (from: )
Getting thing: _initializing (from: __spec__)
Getting thing: foo (from: )
Getting thing: bar (from: foo)
Getting thing: woof (from: foo.bar)
Getting thing: last (from: foo.bar.woof)
Getting thing: __getstate__ (from: foo.bar.woof)
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    print(pickle.dumps(f))
TypeError: 'Magic' object is not callable

I wasn't expecting to see anything looking up __getstate__ on module.foo.bar.woof, but even if we force that lookup to fail by adding:

if name == '__getstate__': raise AttributeError()

into our __getattr__ it still fails with:

Traceback (most recent call last):
  File "test.py", line 7, in <module>
    print(pickle.dumps(f))
_pickle.PicklingError: Can't pickle <class 'module.Magic'>: it's not the same object as module.Magic

What gives? Am I missing something with __spec__? The docs for __spec__ pretty much just stress setting it appropriately, but don't seem to actually explain much.

More importantly, the bigger question is: how am I supposed to go about making types I programmatically generated via a pseudo module's __getattr__ implementation pickle properly?

(And obviously once I've managed to get pickle.dumps to produce something I expect pickle.loads to call undump with the same thing)

Flexo
  • Is there a reason you need to use the module switching trick? I simplified your example a bit and it works fine with `module.Magic('').foo.bar.woof.last()`. The module switch causes the `it's not the same object` error. I'm still working out the details of what causes all these problems though. – jeremye Sep 09 '17 at 00:12

2 Answers

To pickle f, pickle needs to pickle f's class, module.foo.bar.woof.last.

The docs don't claim support for pickling arbitrary classes. They claim the following:

The following types can be pickled:

  • ...
  • classes that are defined at the top level of a module

module.foo.bar.woof.last isn't defined at the top level of a module, even a pretend module like module. In this not-officially-supported case, the pickle logic ends up trying to pickle module.foo.bar.woof, either here:

    elif parent is not module:
        self.save_reduce(getattr, (parent, lastname))

or here, in the C implementation:

    else if (parent != module) {
        PickleState *st = _Pickle_GetGlobalState();
        PyObject *reduce_value = Py_BuildValue("(O(OO))",
                                    st->getattr, parent, lastname);
        status = save_reduce(self, reduce_value, NULL);

module.foo.bar.woof can't be pickled for multiple reasons. It returns a non-callable Magic instance for all unsupported method lookups, like __getstate__, which is where your first error comes from. The module-switching thing prevents finding the Magic class to pickle it, which is where your second error comes from. There are probably more incompatibilities.
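
One possible way around this, sketched here under assumptions and not tested against the question's module-switching setup, is to keep pickle from ever needing to locate the generated class. If instances define __reduce__ and return a plain function that lives under a stable top-level name, pickle stores a reference to that function plus its arguments instead of a reference to the class. The module name dyn_helpers, the function rebuild(), and the dump/undump method names are all hypothetical and chosen for this illustration:

```python
import functools
import pickle
import sys
import types

# Hypothetical helper module that exposes the reconstructor at its top level.
helpers = types.ModuleType('dyn_helpers')
sys.modules['dyn_helpers'] = helpers

@functools.lru_cache(maxsize=None)
def make_type(name):
    # Instances reduce to (rebuild, (name, state)), so pickle records only a
    # reference to rebuild() and never has to find the generated class.
    def reduce(self):
        return (rebuild, (name, self.dump()))
    return type(name, (), {
        'dump': lambda self: b'Some data',  # dummy payload, as in the question
        'undump': lambda self, data: setattr(self, 'data', data),
        '__reduce__': reduce,
    })

def rebuild(name, data):
    """Recreate an instance from the type name and the dumped state."""
    obj = make_type(name)()
    obj.undump(data)
    return obj

# Make rebuild() findable as dyn_helpers.rebuild, which is what pickle checks.
rebuild.__module__ = 'dyn_helpers'
rebuild.__qualname__ = 'rebuild'
helpers.rebuild = rebuild

original = make_type('foo.bar.woof.last')()
copy = pickle.loads(pickle.dumps(original))
print(type(copy) is type(original), copy.data)
```

The trade-off is that the reconstructor itself still has to be reachable by name at the top level of some importable module, so it would need to live somewhere the module switch does not hide.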

user2357112

Making the class callable has already been shown to be a step in the wrong direction. Thanks to the hack below, though, I found a workaround that makes the class resolvable by its type. Going by the error <class 'module.Magic'>: it's not the same object as module.Magic, the pickler's lookup does not go through the same call and ends up with a different object from the one being pickled. This is a common problem when pickling self-instantiating classes, in this case an object looked up through its class. The solution is therefore to patch the class with its type: @mock.patch('module.Magic', type(module.Magic)). That is the short answer.

Main.py

import module
import pickle
import mock


# The original snippet imported `module` but then referred to `module1`;
# the name is made consistent here.
f = module.foo.bar.woof.last
print(f().__getstate__()) # See, *this* works
print('Pickle starts here')
@mock.patch('module.Magic', type(module.Magic))
def pickleit():
    return pickle.dumps(f())
print(pickleit())

Magic class

class Magic(object):

    def __init__(self, value):
        self.path = value

    # Bare annotation (Python 3.6+ syntax); it does not assign a value
    __class__: lambda x:x

    def __getstate__(self):
        print("Shoot me! i'm at " + self.path)
        return dump(self)

    def __setstate__(self, value):
        print('something will never occur')
        return undump(self, value)

    def __spec__(self):
        print("Wrong side of the planet ")

    def _initializing(self):
        print("Even farther lost ")

    def __getattr__(self, name):
        print('Getting thing: %s (from: %s)' % (name, self.path))
        # for simple testing all calls to make_type must end in last x.y.z.last
        if name != 'last':
            if self.path:
                return Magic(self.path + '.' + name)
            else:
                return Magic(name)
        print('terminal stage')
        return make_type(self.path + '.' + name)

This may be more of a lucky edge than a clean hit, but with it I could see the pickled content dumped to my console.

Abr001am