I've been using pickle.dumps to create a hash for an arbitrary Python object. However, I've found that dict and set ordering isn't canonicalized, so the result is unreliable.
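A minimal reproduction of the dict case (the variable names are only for illustration, and this assumes Python 3.7+, where dicts keep insertion order):

```python
import pickle

# Two dicts that compare equal but were built in a different order.
a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}

print(a == b)                              # True
# pickle writes items in iteration order, so the byte strings differ.
print(pickle.dumps(a) == pickle.dumps(b))  # False
```

Sets have the same problem, except that their iteration order depends on element hashes and can also change between interpreter runs.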
There are several related questions on SO and elsewhere, but I can't find a hashing algorithm that uses the same basis for equality (the results of __getstate__ / __dict__). I understand the basic requirements for rolling my own, but I'd much prefer to use something that's been tested.
Does such a library exist? I suppose what I'm actually asking for is a library that serializes objects deterministically (using __getstate__ and __dict__) so that I can hash the output.
EDIT
To clarify, I'm looking for something different from the values returned by Python's hash (or __hash__). What I want is essentially a checksum for arbitrary objects, which may or may not be hashable. This value should vary based on the objects' state. (I'm using "state" to refer to the dict returned by __getstate__ or, if that's not present, the object's __dict__.)
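For concreteness, here is a rough sketch of the behaviour I have in mind. The names canonical_form and state_checksum are hypothetical, SHA-256 is an arbitrary choice, and the sketch ignores reference cycles and many built-in types, which is exactly why I'd prefer a tested library:

```python
import hashlib


def canonical_form(obj):
    """Return an order-independent string describing obj's state (hypothetical helper)."""
    if isinstance(obj, (type(None), bool, int, float, str, bytes)):
        return f"{type(obj).__name__}:{obj!r}"
    if isinstance(obj, (list, tuple)):
        # Sequence order is meaningful, so keep it.
        return f"{type(obj).__name__}(" + ",".join(canonical_form(x) for x in obj) + ")"
    if isinstance(obj, (set, frozenset)):
        # Sort the canonical forms so iteration order does not matter.
        return f"{type(obj).__name__}(" + ",".join(sorted(canonical_form(x) for x in obj)) + ")"
    if isinstance(obj, dict):
        items = sorted((canonical_form(k), canonical_form(v)) for k, v in obj.items())
        return "dict(" + ",".join(f"{k}=>{v}" for k, v in items) + ")"
    # Anything else: use __getstate__ if present, otherwise __dict__.
    getstate = getattr(obj, "__getstate__", None)
    if getstate is not None:
        state = getstate()
    elif hasattr(obj, "__dict__"):
        state = obj.__dict__
    else:
        raise TypeError(f"no state available for {type(obj).__name__}")
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}(" + canonical_form(state) + ")"


def state_checksum(obj):
    """Hex SHA-256 digest of the canonical form (hypothetical helper)."""
    return hashlib.sha256(canonical_form(obj).encode("utf-8")).hexdigest()


class Example:
    """Unhashable-friendly demo class; purely illustrative."""

    def __init__(self, name, tags):
        self.name = name
        self.tags = tags


first = Example("a", {"x", "y"})
second = Example("a", {"y", "x"})
print(state_checksum(first) == state_checksum(second))  # same state, same checksum
```

Changing any attribute (for example second.name = "b") should change the checksum, regardless of whether the object is hashable.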