Firstly, I'm using the GetHashCode algorithm described here. Now, picture the following (contrived) example:

class Foo
{
    public Foo(int intValue, double doubleValue)
    {
        this.IntValue = intValue;
        this.DoubleValue = doubleValue;
    }

    public int IntValue { get; private set; }
    public double DoubleValue { get; private set; }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;

            hash = hash * 23 + IntValue.GetHashCode();
            hash = hash * 23 + DoubleValue.GetHashCode();
            return hash;
        }

    }
}

class DerivedFoo : Foo
{
    public DerivedFoo(int intValue, double doubleValue)
       : base(intValue, doubleValue)
    {

    }
}

If I have a Foo and a DerivedFoo with the same values for each of the properties, they're going to have the same hash code. That means I could have a HashSet<Foo>, or use the Distinct method in LINQ, and the two instances would be treated as if they were the same.

I'm probably just misunderstanding the use of GetHashCode, but I would expect the two instances to have different hash codes. Is this an invalid expectation, or should GetHashCode use the type in the calculation? (Or should DerivedFoo also override GetHashCode?)

P.S. I realize there are many, many questions on SO relating to this topic, but I haven't spotted one that directly answers this question.

RichK

2 Answers

GetHashCode() is not supposed to guarantee uniqueness (though performance is better the fewer collisions it produces).

The main rule with GetHashCode() is that equivalent objects must have the same hash code, but that doesn't mean non-equivalent objects can't have the same hash code.

If two objects have the same hash code, the Equals() method is then invoked to see whether they are the same. Since the types are different, they will not be equal (depending on how you coded your Equals() override, of course), and thus it will be fine.

Even if you had a different hash code algorithm for each type, there's still always a chance of a collision, thus the need for the Equals() check as well.
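To make the collision case concrete, here is a minimal sketch. CollidingKey is a hypothetical type (not from the question) whose hash code is deliberately constant, so every instance collides; the EqualsCalls counter is only there to show that HashSet<T> consults Equals() when hash codes match.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type whose hash code is deliberately constant, so every
// instance collides with every other instance.
class CollidingKey
{
    public static int EqualsCalls;   // counts how often Equals is invoked

    public CollidingKey(string name) { Name = name; }

    public string Name { get; private set; }

    public override int GetHashCode() { return 42; }

    public override bool Equals(object obj)
    {
        EqualsCalls++;
        var other = obj as CollidingKey;
        return other != null && Name == other.Name;
    }
}

static class Program
{
    public static int CountDistinct()
    {
        var set = new HashSet<CollidingKey>
        {
            new CollidingKey("a"),
            new CollidingKey("b"),  // collides with "a", but is not equal to it
            new CollidingKey("a")   // collides with and equals the first entry
        };
        return set.Count;
    }

    static void Main()
    {
        Console.WriteLine(CountDistinct());               // 2
        Console.WriteLine(CollidingKey.EqualsCalls > 0);  // True
    }
}
```

Even with a useless hash function the set stays correct; it just degrades to comparing against every colliding entry, which is why a well-distributed hash matters for performance only.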

Now, given your example above, you do not override Equals(). This makes every object distinct regardless of the hash code, because the default implementation of Equals() from object is a reference equality check.
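As a quick check of that claim, the sketch below reuses the question's Foo and DerivedFoo unchanged (GetHashCode() overridden, Equals() not). The Program class and its CountInSet helper are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

class Foo
{
    public Foo(int intValue, double doubleValue)
    {
        IntValue = intValue;
        DoubleValue = doubleValue;
    }

    public int IntValue { get; private set; }
    public double DoubleValue { get; private set; }

    // Same algorithm as in the question; Equals is deliberately NOT overridden.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + IntValue.GetHashCode();
            hash = hash * 23 + DoubleValue.GetHashCode();
            return hash;
        }
    }
}

class DerivedFoo : Foo
{
    public DerivedFoo(int intValue, double doubleValue)
        : base(intValue, doubleValue)
    {
    }
}

static class Program
{
    public static int CountInSet()
    {
        var set = new HashSet<Foo> { new Foo(1, 2.0), new DerivedFoo(1, 2.0) };
        return set.Count;
    }

    static void Main()
    {
        var foo = new Foo(1, 2.0);
        var derived = new DerivedFoo(1, 2.0);

        Console.WriteLine(foo.GetHashCode() == derived.GetHashCode()); // True
        Console.WriteLine(foo.Equals(derived));                        // False (reference equality)
        Console.WriteLine(CountInSet());                               // 2
    }
}
```

So with the code exactly as posted, the HashSet<Foo> already keeps both instances; the matching hash codes alone do not merge them.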

If you haven't, go ahead and override Equals() for each of your types as well (they can inherit your implementation of GetHashCode() if you like, or have new ones), and there you can make sure that the type of the compared-to object is the same before declaring them equal. And make sure Equals() and GetHashCode() are always implemented so that:

  • Objects that are equal according to Equals() must return the same GetHashCode() result.
  • Objects with different GetHashCode() results must not be equal according to Equals().
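One way to satisfy both rules for the question's classes is sketched below. It is only one possible shape: the exact-type comparison via GetType() is an assumption about the desired semantics (a Foo never equals a DerivedFoo), and Program/CountInSet are illustrative names.

```csharp
using System;
using System.Collections.Generic;

class Foo
{
    public Foo(int intValue, double doubleValue)
    {
        IntValue = intValue;
        DoubleValue = doubleValue;
    }

    public int IntValue { get; private set; }
    public double DoubleValue { get; private set; }

    public override bool Equals(object obj)
    {
        // Require the exact same runtime type, so a Foo never equals a DerivedFoo.
        if (obj == null || obj.GetType() != GetType())
            return false;

        var other = (Foo)obj;
        // Compare exactly the members that feed GetHashCode below.
        return IntValue == other.IntValue && DoubleValue.Equals(other.DoubleValue);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + IntValue.GetHashCode();
            hash = hash * 23 + DoubleValue.GetHashCode();
            return hash;
        }
    }
}

// Inherits both Equals and GetHashCode from Foo.
class DerivedFoo : Foo
{
    public DerivedFoo(int intValue, double doubleValue)
        : base(intValue, doubleValue)
    {
    }
}

static class Program
{
    public static int CountInSet()
    {
        var set = new HashSet<Foo>
        {
            new Foo(1, 2.0),
            new Foo(1, 2.0),        // equal to the first Foo, so not added
            new DerivedFoo(1, 2.0)  // same hash code, different type, so added
        };
        return set.Count;
    }

    static void Main()
    {
        Console.WriteLine(new Foo(1, 2.0).Equals(new Foo(1, 2.0)));        // True
        Console.WriteLine(new Foo(1, 2.0).Equals(new DerivedFoo(1, 2.0))); // False
        Console.WriteLine(CountInSet());                                   // 2
    }
}
```

Because the type check lives in Equals(), the hash code does not need to include the type; the Foo and the DerivedFoo simply share a bucket and are told apart by Equals().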
James Michael Hare
  • Thanks for the info, I didn't realize an Equals check was made if there are hash collisions. Do you know where I might find some documentation to support this? – RichK Sep 08 '11 at 14:47
  • The Eric Lippert blog post @Brent mentioned is a great start: http://blogs.msdn.com/b/ericlippert/archive/2011/02/28/guidelines-and-rules-for-gethashcode.aspx – James Michael Hare Sep 08 '11 at 14:57
  • The main thing is that a) Equals() should make sure both instances are the same type, b) Equals() should always check AT LEAST the same fields used in the hash code (lest the hash code be different for equal objects, which is BAD). – James Michael Hare Sep 08 '11 at 14:58
The two instances do not need to have different hash codes. HashSet and other framework classes do not assume that hash codes are unique, because there can be collisions even within a type. GetHashCode is simply used to determine the location within the hash table to store the item. If there is a collision within the HashSet, it then falls back on the result of the Equals method to determine the unique match. This means that whenever you implement GetHashCode, you should also implement Equals (and check that the types match). Similarly, whenever you implement Equals, you should also implement GetHashCode. See a good explanation by Eric Lippert here.

Brent M. Spell