Consider the following Python code:
import cloudpickle

class Foo:
    def __init__(self, num):
        self.num = num

def outer(num):
    return Foo(num)

print(cloudpickle.dumps(outer))
This produces a different pickle every time you run the code. Disassembling the pickles from two runs with pickletools and diffing the output shows the following:
144c144
< 552: \x8c SHORT_BINUNICODE '2e3db4572bb349268962a75a8a6f034c'
---
> 552: \x8c SHORT_BINUNICODE '89ee770de9b745c4bbe83c353f1debba'
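For reference, the same comparison can be done programmatically instead of diffing the text output of pickletools. This is only a minimal sketch: `diff_pickle_ops` is a hypothetical helper name, it assumes both pickles have the same opcode layout (extra trailing opcodes in the longer one are ignored), and the two sample pickles are plain stand-ins built with the standard pickle module from the hex strings in the diff above, not real cloudpickle output.

```python
import pickle
import pickletools


def diff_pickle_ops(first: bytes, second: bytes):
    """Return (position, opcode name, arg in first, arg in second) per differing opcode.

    Hypothetical helper: opcodes are compared pairwise in order, so any
    extra opcodes at the end of the longer pickle are not reported.
    """
    ops_first = pickletools.genops(first)
    ops_second = pickletools.genops(second)
    differences = []
    for (op_a, arg_a, pos_a), (op_b, arg_b, _pos_b) in zip(ops_first, ops_second):
        if op_a.name != op_b.name or arg_a != arg_b:
            differences.append((pos_a, op_a.name, arg_a, arg_b))
    return differences


# Stand-in pickles that differ only in one 32-character hex string,
# mimicking the two runs shown in the diff (assumption: protocol 4).
run_a = pickle.dumps(("Foo", "2e3db4572bb349268962a75a8a6f034c"), protocol=4)
run_b = pickle.dumps(("Foo", "89ee770de9b745c4bbe83c353f1debba"), protocol=4)
sample_diffs = diff_pickle_ops(run_a, run_b)
```

To apply it to the actual problem, save the bytes returned by cloudpickle.dumps(outer) from two separate runs to files and pass their contents to the helper.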
Now, I understand that cloudpickle doesn't guarantee that its pickles are deterministic (link), but I am curious why these two pickles differ. It looks like the difference above comes from some kind of hash of the Foo class that changes between runs.
Note that I ran the Python program with a fixed PYTHONHASHSEED.
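To show what fixing PYTHONHASHSEED does and does not rule out, here is a small sketch using only the standard library. `run_with_hash_seed` is a hypothetical helper name, and the seed value "0" is an arbitrary choice for illustration. It runs a snippet in two fresh interpreters with the same seed and captures what each prints; the built-in hash of a string comes out the same in both, which suggests the changing value in the pickle is not derived from str hashing.

```python
import os
import subprocess
import sys


def run_with_hash_seed(code: str, seed: str = "0") -> str:
    """Run `code` in a fresh interpreter with PYTHONHASHSEED set; return its stdout."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Two separate processes, same seed: the string hash is reproducible.
first_run = run_with_hash_seed("print(hash('Foo'))")
second_run = run_with_hash_seed("print(hash('Foo'))")
```

The same helper could run the cloudpickle reproduction script in place of the hash snippet to confirm that the pickle still changes across processes with the seed held fixed.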
PS: This is enough to reproduce the issue:
import pickletools
import cloudpickle

class Foo:
    def __init__(self, num):
        self.num = num

pickletools.dis(cloudpickle.dumps(Foo))
So it seems that each class has some property that gets baked into the pickle cloudpickle produces, but I don't know what that property is.