When I run script.py below in Spyder (IPython console), I get different values for b's hash between runs, whereas a's hash stays the same.
script.py
from sympy import Symbol
import numpy as np
x = Symbol("x")
a = np.array([ 1, x])
b = np.array([1.0, x])
print(hash(a.tobytes()))
print(hash(b.tobytes()))
Running this script yields the following output:
In [1]: runfile(./script.py)
-1258340495102975319
3795610135772286033
In [2]: runfile(./script.py)
-1258340495102975319
7432739601143179777
In [3]: runfile(./script.py)
-1258340495102975319
1451381667883822748
In [4]: runfile(./script.py)
-1258340495102975319
2683979045255549228
In [5]: runfile(./script.py)
-1258340495102975319
-345973347917904018
Could someone shed some light on this strange behaviour, and possibly suggest a way to get consistent hashes in the floating-point case?
I've tried reproducing the behaviour inside a for loop, but in that case the hashes are consistent. The differences only occur when the script is run multiple times.
I wouldn't expect the hashes of a and b to be the same, since 1 and 1.0 are different data types, but I would expect the hashes of both to be consistent over runs (as a is).
The problem isn't specific to hashing: the tobytes() method itself gives different results on each run, but the hash makes the differences more obvious.
EDIT: After a little more testing, I've realised that the problem is not specific to SymPy. The same behaviour occurs with any NumPy array that has an object data type. For instance, print(hash(np.array([1.0], dtype="O").tobytes())) gives different results over different runs.
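One way to check what those bytes actually contain is sketched below. It assumes CPython, where id() returns an object's memory address, and it assumes that an object-dtype array stores a reference to each element rather than the element's value. The names obj, raw and address are only illustrative.

```python
import sys
import numpy as np

obj = 1.0  # keep our own reference so the object stays alive
arr = np.array([obj], dtype=object)

raw = arr.tobytes()

# Assumption: on CPython, id() is the object's memory address, so the raw
# bytes of an object array should decode to that same address.
address = int.from_bytes(raw, sys.byteorder)

print(len(raw) == arr.itemsize)  # one pointer-sized slot per element
print(address == id(obj))        # the bytes encode where obj lives, not 1.0
```

If both checks print True, the bytes depend on where the object happens to be allocated, which would explain why they can change whenever the object is created afresh.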
EDIT2: There is still some unexplained behaviour, even given the answers about pointers, since this behaviour occurs only for arrays with object data type.
In [2]: hash(np.array([1.0]).tobytes())
Out[2]: -1405879698645296540
In [3]: hash(np.array([1.0]).tobytes())
Out[3]: -1405879698645296540
In [4]: hash(np.array([1.0]).tobytes())
Out[4]: -1405879698645296540
In [5]: hash(np.array([1.0]).tobytes())
Out[5]: -1405879698645296540
In [6]: hash(np.array([1.0], dtype="O").tobytes())
Out[6]: 7075328050134915067
In [7]: hash(np.array([1.0], dtype="O").tobytes())
Out[7]: -6443853770133964536
In [8]: hash(np.array([1.0], dtype="O").tobytes())
Out[8]: 889083274033361878
In [9]: hash(np.array([1.0], dtype="O").tobytes())
Out[9]: -6819397306369441685
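As a possible workaround for the object case, the sketch below builds a digest from each element's type name and repr() instead of from the array's raw memory. stable_digest is a hypothetical helper, not part of NumPy or SymPy, and it is only as stable as the elements' repr() output. A plain string stands in for the SymPy symbol here to keep the example self-contained.

```python
import hashlib
import numpy as np

def stable_digest(arr):
    """Hypothetical helper: hash element reprs rather than raw memory."""
    h = hashlib.sha256()
    h.update(str(arr.shape).encode())
    for item in arr.ravel():
        # Include the type name so that 1 and 1.0 give different digests.
        h.update(type(item).__name__.encode())
        h.update(b"\0")
        # Assumption: repr() of every element is deterministic across runs.
        h.update(repr(item).encode())
        h.update(b"\0")
    return h.hexdigest()

a = np.array([1, "x"], dtype=object)
b = np.array([1.0, "x"], dtype=object)
b_again = np.array([1.0, "x"], dtype=object)  # fresh objects, same values

print(stable_digest(b) == stable_digest(b_again))
print(stable_digest(a) == stable_digest(b))
```

Using hashlib rather than the built-in hash() also avoids Python's per-process hash randomisation for bytes and str, which matters if the digests are compared across separate interpreter sessions.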