I came across some odd behavior where pickle produces different serialized output for what looks like the same input. It seems to be triggered when serializing an object that was itself deserialized from a pickle (as could happen if the object was retrieved from an mp.Queue). It also only happens when the list has more than one element (not sure why). Can someone explain what is going on here? I have read this question, but it doesn't seem to be the same situation, as a, b, and c all have different ids.

Code to reproduce below. I'm using Python 3.6.3 from Anaconda.

import pickle
import numpy as np

def f():
    return np.array([1,2,3])

a = f()
b = f()
c = pickle.loads(pickle.dumps(f()))

print(id(a), id(b), id(c)) # 4464565184 4464565504 4466959424

print('[a]==[b]', pickle.dumps([a]) == pickle.dumps([b])) # True
print('[a]==[c]', pickle.dumps([a]) == pickle.dumps([c])) # True
print('[b]==[c]', pickle.dumps([b]) == pickle.dumps([c])) # True
print('[a,b]==[a,c]', pickle.dumps([a,b]) == pickle.dumps([a,c])) # False
print('a.tolist()==c.tolist()', a.tolist()==c.tolist()) # True
krasnaya

1 Answer

This seems to be a problem specific to numpy: without it, pickle works just fine in Python 3.6.6:

import pickle

def f():
    return [1,2,3]

a = f()
b = f()
c = pickle.loads(pickle.dumps(f()))

print(id(a), id(b), id(c)) # 1900503702728 1900503865288 1900503976328

print('[a]==[b]', pickle.dumps([a]) == pickle.dumps([b])) # True
print('[a]==[c]', pickle.dumps([a]) == pickle.dumps([c])) # True
print('[b]==[c]', pickle.dumps([b]) == pickle.dumps([c])) # True
print('[a,b]==[a,c]', pickle.dumps([a,b]) == pickle.dumps([a,c])) # True
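
One mechanism that could produce this pattern is pickle's memo. Within a single dumps() call, an object that has already been written is emitted as a short back-reference the second time it appears, so the output depends on object identity and not only on equality. Below is a minimal stdlib-only sketch of that effect. The name `shared` is hypothetical and stands in for whatever sub-object two freshly created arrays might have in common. This is an assumption about the cause, not something verified against numpy itself.

```python
import pickle

# Hypothetical stand-in for a sub-object that several containers
# reference by identity (an assumption made for illustration only).
shared = [0]

a = [shared, [1, 2, 3]]
b = [shared, [1, 2, 3]]
# The round trip gives c its own private copy of the shared list.
c = pickle.loads(pickle.dumps(a))

# Equal by value, but c[0] is a different object from a[0].
print(a == c, a[0] is b[0], a[0] is c[0])  # True True False

# With one element there is no sharing to record, so the bytes match.
print('[a]==[c]', pickle.dumps([a]) == pickle.dumps([c]))  # True

# In [a, b] the second occurrence of `shared` is written as a memo
# back-reference; in [a, c] the copy c[0] is written out in full.
print('[a,b]==[a,c]', pickle.dumps([a, b]) == pickle.dumps([a, c]))  # False
```

If the same thing is happening with the arrays, some object reachable from both a and b would be identical by `is`, while the corresponding object in c would be a fresh copy. One candidate worth checking (again, an assumption) is the dtype, by comparing `a.dtype is b.dtype` with `a.dtype is c.dtype`. This would also fit the observation that single-element lists compare equal, since nothing can be shared across elements there.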