I've noticed that hash values created from pandas DataFrames differ depending on whether the snippet below is executed on Unix or Windows.
import pandas as pd
import numpy as np
import hashlib
df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                  columns=['a', 'b', 'c'])
hashvalue_new = hashlib.md5(df.values.flatten().data).hexdigest()
print(hashvalue_new)
The above code prints d0ecb84da86002807de1635ede730f0a on Windows machines and 586962852295d584ec08e7214393f8b2 on Unix machines. Can someone more knowledgeable than me explain why this happens and suggest a way to create a consistent hash value across platforms? I'm running Python 3.8.5 and pandas 1.2.5.
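One plausible explanation, offered as an assumption rather than something verified on both machines: in the NumPy 1.x releases that pandas 1.2.5 runs on, the default integer dtype is 32-bit on Windows and 64-bit on most Unix systems, so the same values occupy different bytes in memory. The sketch below pins the dtype and byte order before hashing; `stable_hash` is a hypothetical helper name, and the choice of little-endian 64-bit integers (`'<i8'`) is arbitrary.

```python
import hashlib

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                  columns=['a', 'b', 'c'])


def stable_hash(frame):
    """Hash the frame's values using a fixed dtype and byte order.

    Hypothetical helper: assumes every column holds integers that fit in 64 bits.
    """
    # Convert to little-endian int64 in C order so the byte layout
    # no longer depends on the platform's default integer size.
    values = np.ascontiguousarray(frame.to_numpy(), dtype='<i8')
    return hashlib.md5(values.tobytes()).hexdigest()


# The platform default shows up here (int32 or int64).
print(df.to_numpy().dtype)
print(stable_hash(df))
```

Checking `df.to_numpy().dtype` on each machine would confirm or rule out this explanation. Note that this hashes only the values, not the column names or the index.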