overide GetHashCode when input is two Int16

Question

Both User.ID and Group.ID are Int16 and immutable and I want to generate an optimal HashCode.

This is the equality

    public override bool Equals(Object obj)
    {
        //Check for null and compare run-time types.
        if (obj == null || GetType() != obj.GetType()) return false;
        UserGroup item = (UserGroup)obj;
        return (User.ID == item.User.ID && Group.ID == item.Group.ID);
    }

What would be an optimal GetHashCode. Right now I am using the following but only because I saw it as an example. The primary use of the Object is in a HashSet and that HashSet gets a lot of .Select(x => x.User.ID = y) or .Select(x => x.Group.ID = y).

  public override int GetHashCode() { return (int)User.ID ^ (int)Group.ID; }

Instead of `GetType() != obj.GetType()`, please write `!(obj is UserGroup)`. — Timwi, Apr 01 '12 at 16:25

Hans Passant · Accepted Answer · 2012-04-01T17:41:21.947

1

Never skip the opportunity to generate a perfect hash. Two 16-bit shorts fit in a 32-bit int:

    public override int GetHashCode() {
        return (User.ID & 0xffff) + (Group.ID << 16);
    }

The first part of the expression isolates the lower 16 bits, the & operator is necessary to make it work properly for negative values so only bits 0 through 15 are used. The second part moves the 16 bits of Group.ID to bit positions 16 through 31. Adding them combines the two sets of 16 bits to make a 32-bit number.

edited Apr 01 '12 at 17:41

answered Apr 01 '12 at 16:03

Hans Passant

922,412
146
1,693
2,536

Don't fully understand this answer either but in testing with several values it did yield unique results. Those same values using XOR did not yield unique results so I is clearly a better hash. I posted another question an packing two 32 into a 64 and then extracting if you want to pick up a quick answer. – paparazzo Apr 01 '12 at 17:21
It's not just a better hash, it is a *perfect* hash :) – Hans Passant Apr 01 '12 at 17:27

score 0 · Answer 2 · edited May 23 '17 at 12:11

0

Well, actually neither is preferred if you want good hash distribution (just one illustration: with your implementation, User.ID = 10, Group.ID = 5 will hash to the same value as User.ID = 5, Group.ID = 10). It's usable, of course, but take a look at what the master had to say about this.

edited May 23 '17 at 12:11

Community

1
1

answered Apr 01 '12 at 15:43

Alan

6,501
1
28
24

I saw that and trust Keets as the right answer but to be honest I did not understand it. – paparazzo Apr 01 '12 at 17:21
Well, the (vastly simplified) point is that you want to avoid different inputs to generate the same hash code (a "hash collision"), because then the performance of any hash-based algorithm or class will begin to degenerate from O(1) towards O(n). Hans quite nailed it with his answer, because in your use case it is possible to generate a hash code that is guaranteed to be unique for each unique input combination (it's a "perfect hash"): you have two 16-bit integers and the hash code is 32 bits, so it can accommodate all possible combinations of your inputs. – Alan Apr 02 '12 at 07:32

overide GetHashCode when input is two Int16

2 Answers2