I'm using version 1.2 (currently the latest) of the ordered_set module linked to from this answer. I've been getting some weird behavior and have traced it to this:

from ordered_set import OrderedSet
import pickle

os_orig = OrderedSet()
print os_orig # 'OrderedSet'
print os_orig.items # '[]'
pickled = pickle.dumps(os_orig)
loaded = pickle.loads(pickled)
print loaded

The last line raises AttributeError: 'OrderedSet' object has no attribute 'items'. Everything works fine if the OrderedSet is not empty.

Unfortunately I am in over my head here when it comes to pickle--what is going wrong?

EDIT: I should add that the module seems to support pickle. From the README: "added a __getstate__ and __setstate__ so it can be pickled"

kuzzooroo

1 Answer

The pickling support for OrderedSet breaks when the set is empty, because the state returned by __getstate__ is essentially empty:

>>> OrderedSet().__getstate__()
[]

When the pickle is loaded again, the pickle module never calls __setstate__, because the value returned by __getstate__ was empty. Not calling __setstate__ means the OrderedSet.__init__() method never gets called, so you end up with a broken object that has no items attribute. See the __setstate__ documentation:

Note: For new-style classes, if __getstate__() returns a false value, the __setstate__() method will not be called.

An empty list is a false value.
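
A minimal sketch of this behaviour, using a hypothetical Bag class (a stand-in, not part of ordered_set) whose __getstate__ returns a plain list. It assumes pickle protocol 0, the default for pickle.dumps in Python 2; higher protocols may take a different code path.

```python
import pickle


class Bag(object):
    """Hypothetical stand-in for OrderedSet; not the real class."""

    def __init__(self, items=None):
        self.items = list(items) if items is not None else []

    def __getstate__(self):
        # Returns [] for an empty Bag, which is a false value.
        return list(self.items)

    def __setstate__(self, state):
        # Only runs if pickle stored a state for the object.
        self.__init__(state)


# Protocol 0 is passed explicitly to match the Python 2 default.
full = pickle.loads(pickle.dumps(Bag([1, 2]), 0))
empty = pickle.loads(pickle.dumps(Bag(), 0))

full_ok = hasattr(full, "items")    # __setstate__ ran, so items exists
empty_ok = hasattr(empty, "items")  # __setstate__ was skipped
```

Under these assumptions, the non-empty Bag comes back intact while the empty one is created without ever being initialized, which matches the AttributeError in the question.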

The author must've tested only with pickling non-empty OrderedSet() instances, which works fine.

You can fix the issue by replacing the __getstate__ and __setstate__ methods:

def __getstate__(self):
    return (list(self),)
OrderedSet.__getstate__ = __getstate__

def __setstate__(self, state):
    if isinstance(state, tuple):
        state = state[0]
    self.__init__(state)
OrderedSet.__setstate__ = __setstate__

Now a non-empty, 1-element tuple is returned, forcing pickle to call __setstate__ even for the empty set. The new __setstate__ still accepts the previous pickle format, a plain list object, so existing pickles keep loading.
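
To illustrate the tuple-wrapping approach without depending on ordered_set itself, here is a rough sketch on a hypothetical Bag class (the class and its items attribute are assumptions made for the example). It again uses protocol 0 and simulates an old-format state by calling __setstate__ by hand.

```python
import pickle


class Bag(object):
    """Hypothetical stand-in for OrderedSet; not the real class."""

    def __init__(self, items=None):
        self.items = list(items) if items is not None else []

    def __getstate__(self):
        # A 1-element tuple is truthy even when the wrapped list is empty.
        return (list(self.items),)

    def __setstate__(self, state):
        # Unwrap the new tuple format; a bare list is the old format.
        if isinstance(state, tuple):
            state = state[0]
        self.__init__(state)


# An empty Bag now survives the round trip.
restored = pickle.loads(pickle.dumps(Bag(), 0))

# Simulate loading an old-format state (a plain list) by hand.
legacy = Bag.__new__(Bag)
legacy.__setstate__([1, 2])
```

Wrapping the list keeps the pickled state truthy in every case, and the isinstance check is what lets both state formats coexist.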

I've reported this as a bug with the project, since closed as resolved.

Martijn Pieters
  • `bool(list(),)` (or `bool([],)`) is also a false value. – martineau Jul 20 '14 at 03:50
  • @martineau, what are the implications of your comment? I get output from `if (list(),): print "hey"` but not `if bool(list(),): print "hey"`, but I think the former is what counts in this context. Martijn Pieters' code seems to fix the issue for me. – kuzzooroo Jul 20 '14 at 03:59
  • @martineau: but `bool(([],))` is **not** empty. You do need the parentheses there, without those you are passing in *just* an empty list, not a tuple with one element. – Martijn Pieters Jul 20 '14 at 08:43
  • @kuzzooroo martineau has overlooked the fact that in a call signature you have to have the parentheses around a tuple. `bool([],)` is not passing in a tuple but *just* a single empty list. – Martijn Pieters Jul 20 '14 at 08:55
  • @kuzzooroo: Martijn's correct, my mistake. I got confused due to the lack of the extra set of parentheses. Sorry about that. – martineau Jul 20 '14 at 14:08
  • Thanks for the bug report -- I've fixed it. I never knew that pickle states had to be truthy objects before this. – rspeer Jul 21 '14 at 19:01