
I created a shelve database (a persistent Python dictionary) with Python 2 and want to access it with Python 3. Also, is there a way to keep it backward compatible? That is, once it has been pickled with Python 3, can I go back and read it with Python 2?

1 Answer


Have a look at the protocol parameter of pickle.dump, pickle.dumps and shelve.open (fix_imports is set to True by default).

I will focus on pickle, since shelve is backed by pickle.

Here's a dump of an object pickled using protocol 0 (default protocol for my CPython 2.7.17):

>>> s = "(dp0\nS'a'\np1\n(lp2\nI0\naI1\naI2\naI3\naI4\nas."

You can load it using Python 3.6 (the protocol 0 is handled but you have to convert the string to bytes):

>>> import pickle
>>> bs = s.encode("ascii")
>>> obj = pickle.loads(bs)
>>> obj == {'a': [0, 1, 2, 3, 4]}
True

But if you want to pickle it again using the default protocol of Python 3.6 (protocol 3), you have:

>>> pickle.dumps(obj)
b'\x80\x03}q\x00X\x01\x00\x00\x00aq\x01]q\x02(K\x00K\x01K\x02K\x03K\x04es.'

Python 2 won't be able to load this dump! (ValueError: unsupported pickle protocol: 3)

But if you specify a protocol that Python 2 knows, e.g. protocol=2:

>>> pickle.dumps(obj, protocol=2)
b'\x80\x02}q\x00X\x01\x00\x00\x00aq\x01]q\x02(K\x00K\x01K\x02K\x03K\x04es.'

The representation is a bit different, but Python 2 will be able to load this dump.
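
Since shelve pickles each value with the protocol passed to shelve.open, the same idea can be sketched for a shelf. This is a minimal sketch run under Python 3; the file name `data` and the temporary directory are placeholders. It only shows the protocol parameter: it assumes the dbm backend picked by Python 3 is one that your Python 2 installation can also open, which is worth checking separately.

```python
import os
import shelve
import tempfile

expected = {"a": [0, 1, 2, 3, 4]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data")  # placeholder file name

    # Write with protocol 2, the highest protocol Python 2 understands.
    with shelve.open(path, protocol=2) as db:
        db["a"] = expected["a"]

    # Reopen and read back; no protocol is needed for reading.
    with shelve.open(path) as db:
        restored = dict(db)

print(restored == expected)
```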


To summarize:

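  • use pickle.loads(s.encode("ascii")) to load a protocol 0 (text) dump produced by Python 2; a dump that is already bytes can be passed to pickle.loads directly
  • use pickle.dumps(obj, protocol=2) to dump an object with the highest protocol handled by Python 2. The result is bytes starting with b'\x80', so .decode("ascii") fails on it; use protocol=0 if you need an ASCII dump of an object like this one.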
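
The two steps can be combined into a small round trip. The sketch below runs under Python 3; the helper names load_py2_dump and dump_for_py2 are made up for illustration. It also uses the encoding parameter of pickle.loads, which controls how Python 2 str objects are decoded: the default is "ASCII", and "latin1" or "bytes" are alternatives when the old data contains non-ASCII byte strings.

```python
import pickle

# Protocol 0 dump written by Python 2 (same text as in the example above).
PY2_DUMP = "(dp0\nS'a'\np1\n(lp2\nI0\naI1\naI2\naI3\naI4\nas."


def load_py2_dump(text, encoding="ASCII"):
    """Load a protocol 0 dump that Python 2 produced as text (hypothetical helper)."""
    return pickle.loads(text.encode("ascii"), encoding=encoding)


def dump_for_py2(obj):
    """Pickle obj with protocol 2, which Python 2 can still read (hypothetical helper)."""
    return pickle.dumps(obj, protocol=2)


# Python 2 str values become Python 3 str with the default encoding...
obj = load_py2_dump(PY2_DUMP)
# ...or stay as bytes when encoding="bytes" is requested.
as_bytes = load_py2_dump(PY2_DUMP, encoding="bytes")

redumped = dump_for_py2(obj)
```

Passing encoding="bytes" turns the key 'a' into b'a', which matters if the Python 2 data held binary strings rather than text.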

If you want to stick with the original protocol, you can use the pickletools module to find it, but this is a bit hacky:

def get_protocol(dump):
    """
    :returns:   the protocol of the dump
    :raises:    ValueError if the bytes are not a pickle dump.
    :raises:    StopIteration if the protocol was not found
    """
    import pickletools
    op, _, _ = next(pickletools.genops(dump))
    if op.proto < 2:
        return op.proto
    else:
        return next((arg for opcode, arg, _ in pickletools.genops(dump) if opcode.name == "PROTO"))

>>> for p in range(5):
...     bs = pickle.dumps(obj, protocol=p)
...     print(get_protocol(bs), end='')
01234

Now, you can do:

original_protocol = get_protocol(bs)
obj = pickle.loads(bs)
# do something with `obj`
bs = pickle.dumps(obj, protocol=original_protocol)

(That doesn't guarantee that the representation will be exactly the same.)
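
To illustrate that caveat: two dumps can use the same protocol and load to equal objects while differing byte for byte. The sketch below uses pickletools.optimize, which drops unused PUT opcodes, only as a convenient way to obtain a second valid dump of the same object.

```python
import pickle
import pickletools

obj = {"a": [0, 1, 2, 3, 4]}

original = pickle.dumps(obj, protocol=2)
# Same pickle with the unused memo (PUT) opcodes removed.
optimized = pickletools.optimize(original)

same_object = pickle.loads(original) == pickle.loads(optimized)
same_bytes = original == optimized
# Both dumps still start with the PROTO opcode for protocol 2.
same_protocol = original[:2] == optimized[:2] == b"\x80\x02"
```

So comparing dumps byte for byte is not a reliable way to tell whether the pickled objects are equal; compare the loaded objects instead.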

jferard
  • Thank you for your quick solution. I will try it out. However, is it possible to convert a shelve to JSON and back? – aadithya venkat Jun 17 '20 at 00:27
  • @aadithyavenkat See https://stackoverflow.com/a/2259313/6914441 for pickle/shelve vs JSON. To summarize: pickle is for Python and only Python, while JSON can be handled by most of the usual languages. – jferard Jun 17 '20 at 17:05