0

I am trying to generate a unique hash code for a list of integers where their order matters.

For a simple example:

{0,1,2} and {0,2,1} are not equal so should produce different hashcodes.

This was my attempt:

private List<int> _test1;
private List<int> _test2;

With the hash created like so:

    _hash1=_hash2=0; 
    for (int i = 0; i < _test1.Count; i++)
        _hash1 += _test1[i] * 17 + (i * 137);
    for (int i = 0; i < _test2.Count; i++)
        _hash2 += _test2[i] * 17 + (i * 137);

My output gives:

_hash1 = 462
_hash2 = 462

So this is clearly not working, does any one know a better solution ? I am not expert on this topic.

Thanks

WDUK
  • 1,412
  • 1
  • 12
  • 29
  • I checked those most of them only care if they all have the same numbers but not the same order. – WDUK Apr 18 '22 at 07:33
  • I would suggest to make a unique number with the combination of the index and the value on that index (like `_hash1 += _test1[i] * i * 17`), now you are seeing the value of each index and the index as 2 separate values that are not linked (only added) and thus don't change the hash in a meaningful way. – Christophe Devos Apr 18 '22 at 07:40
  • 1
    @WDUK I've tested [this answer](https://stackoverflow.com/a/3404820/3181933) and [this answer](https://stackoverflow.com/a/3405330/3181933) and both give different hashes for 0,1,2 vs 2,1,0. Can you provide an example set of numbers where you've found this isn't true? – ProgrammingLlama Apr 18 '22 at 07:43
  • 1
    *So this is clearly not working* - it's because you're essentially doing 0+1+2 and wondering why it's the same as doing 0+2+1 – Caius Jard Apr 18 '22 at 07:47
  • @DiplomacyNotWar the first one mentions they found collisions and the solution is to use equals, which i don't see what the equals method does to solve the collision since it means you have to iterate the list in the equals method which defeats the point of a hash optimisation, the second answer the range of ints is too small. – WDUK Apr 18 '22 at 07:49
  • You'll always find collisions when you try and pigeonhole things. Simplification: you can't put three values into one of two boxes and not expect collisions. For partitioning (as used by `HashSet`, `Dictionary` keys, etc.) the ideal is that the hashcode will be relatively collision free, and only if there is a collision do you then have to check equality. And even though you would have to iterate the list in the equals method, you only have to iterate so far as to determine they are not the same, and you only have to do that if their lengths are the same – ProgrammingLlama Apr 18 '22 at 07:53
  • If you can tell us specifically what you're using this for, we might be able to come up with a better solution. – ProgrammingLlama Apr 18 '22 at 07:54
  • @DiplomacyNotWar i need to check if my list of integers already exists in a list containing lists of integers, which is expensive so I want to create a unique hash code and put the lists in a hash set and just against that instead for a O(1) check. – WDUK Apr 18 '22 at 07:55
  • 1
    Note that HashSet, Dictionary, etc. don't guarantee O(1) checks, this is because collisions can happen. Whatever routine you use to generate the hash codes should make a good effort to prevent collisions, but then `Equals` does the final check. And `HashSet` will check `Equals` even if there isn't a collision. You can see that [here](https://github.com/microsoft/referencesource/blob/master/System.Core/System/Collections/Generic/HashSet.cs#L264). – ProgrammingLlama Apr 18 '22 at 07:59
  • I see so the goal is to make a hash algorithm that reduces the amount of collision chances - not actually guarantee unique hashes? – WDUK Apr 18 '22 at 08:09
  • That's exactly it, yes. The hope is that in most cases the hash code will be unique, so you only have to check equality on a single item. If you refer to [this chart](https://github.com/RehanSaeed/.NET-Big-O-Algorithm-Complexity-Cheat-Sheet/blob/main/Cheat%20Sheet.png) you can see that, at best, `HashSet` will have a complexity of O(1) but it can be as bad as O(n). – ProgrammingLlama Apr 18 '22 at 08:11

0 Answers0