I'm working with a code that is calculating hashes of lists of objects, algorithm was taken from this question: Quick and Simple Hash Code Combinations. Based on second answer values for seed and factor are 1009 and 9176. It works ok for calculating hashes of random lists of integers, but I've found that it simply does not work when lists are similar.
If we create a list of 20 random integers and calculate the hash using:
int[] hashCodes = {
-1641555406,
1406166370,
431811193,
-719284004,
-463280747,
138136561,
-1634028130,
-792182888,
1325264708,
2143865166,
25622596,
-977152280,
1955313253,
-1440973864,
1627089736,
1733757615,
-576076691,
-145918914,
1015082677,
-954685337,
-1307289157
};
int hashCode = 1009;
foreach (var c in hashCodes)
hashCode = hashCode * 9176 + c;
And than change only the first number:
hashCodes[0] = -145574454;
hashCode = 1009;
foreach (var c in hashCodes)
hashCode = hashCode * 9176 + c;
we will end up with the same hash code. Result is the same for any random list of integers - if only the first number is different, we end up with the same hash code around 8-10 iteration.
I believe it's due to integer overflow and truncating highest bits, but I'm not sure. I tried using seed and factor based on the first answer (17 and 31 respectively) and it worked fine. Why is that?
How should such hash (hash of list of integers) be calculated?
Edit: As per comment, this is not cryptographically secure hash and it's not used as such, it's just a way to assign unique integer key to lists of integers.