I need to get the distinct items from a collection, but my source is a collection of objects with two properties, like this:

public class SkillRequirement
{
    public string Skill { get; set; }
    public string Requirement { get; set; }
}

I try to get the distinct collection as follows:

SkillRequirementComparer sCom = new SkillRequirementComparer();

var distinct_list = source.Distinct(sCom);

I tried to implement an IEqualityComparer<T> for this, but I got stumped at the GetHashCode() method.

The class for the Comparer:

public class SkillRequirementComparer : IEqualityComparer<SkillRequirement>
{
    public bool Equals(SkillRequirement x, SkillRequirement y)
    {
        // Two items are equal only when both properties match.
        return x.Skill.Equals(y.Skill) && x.Requirement.Equals(y.Requirement);
    }

    public int GetHashCode(SkillRequirement obj)
    {
        //?????
    }
}

Normally I would just use GetHashCode() on a property, but because I am comparing on two properties, I'm at a loss as to what to do. Am I doing anything wrong, or missing something really obvious?

Felix Weir
  • Be careful when using a `GetHashCode()` that's derived from mutable fields! If you put the object in a hashing collection then change one of the fields - ouch. I suggest making it immutable. – Matthew Watson May 21 '13 at 10:49
  • The values for the properties are originally from a database where the columns do not allow nulls. There are also checks for null or empty strings before an object is added to the source list, so it is not a concern :^) – Felix Weir May 21 '13 at 10:53
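
To make the warning about mutable fields concrete, here is a minimal sketch. The comparer below is hypothetical and hashes only `Skill`, which is enough to show the effect; `MutationDemo` is just an illustrative name.

```csharp
using System;
using System.Collections.Generic;

public class SkillRequirement
{
    public string Skill { get; set; }
    public string Requirement { get; set; }
}

// Hypothetical comparer for this demo: equality uses both properties,
// the hash uses only Skill (still valid, because equal items share a Skill).
public class SkillRequirementComparer : IEqualityComparer<SkillRequirement>
{
    public bool Equals(SkillRequirement x, SkillRequirement y)
    {
        return x.Skill == y.Skill && x.Requirement == y.Requirement;
    }

    public int GetHashCode(SkillRequirement obj)
    {
        return obj.Skill.GetHashCode();
    }
}

public static class MutationDemo
{
    public static void Main()
    {
        var set = new HashSet<SkillRequirement>(new SkillRequirementComparer());
        var item = new SkillRequirement { Skill = "C#", Requirement = "Advanced" };
        set.Add(item);

        // The item is found while its hashed property is unchanged.
        Console.WriteLine(set.Contains(item));

        // Mutating a hashed property after insertion: the set still files
        // the item under the hash it had when it was added.
        item.Skill = "SQL";

        // The lookup now uses the new hash, so it will almost certainly miss,
        // even though the very same object is still stored in the set.
        Console.WriteLine(set.Contains(item));
        Console.WriteLine(set.Count);
    }
}
```

Making the properties read-only (set once in a constructor) avoids this situation entirely.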

2 Answers

You can implement GetHashCode in the following way:

public int GetHashCode(SkillRequirement obj)
{
    unchecked
    {
        int hash = 17;
        hash = hash * 23 + obj.Skill.GetHashCode();
        hash = hash * 23 + obj.Requirement.GetHashCode();
        return hash;
    }
}

(Originally from Jon Skeet.)
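
For completeness, here is a sketch of how that method slots into the comparer from the question and how it behaves with `Distinct`. The sample data and the `DistinctDemo` class are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SkillRequirement
{
    public string Skill { get; set; }
    public string Requirement { get; set; }
}

public class SkillRequirementComparer : IEqualityComparer<SkillRequirement>
{
    public bool Equals(SkillRequirement x, SkillRequirement y)
    {
        return x.Skill.Equals(y.Skill) && x.Requirement.Equals(y.Requirement);
    }

    public int GetHashCode(SkillRequirement obj)
    {
        // Combine the hashes of both properties; overflow is allowed to wrap.
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + obj.Skill.GetHashCode();
            hash = hash * 23 + obj.Requirement.GetHashCode();
            return hash;
        }
    }
}

public static class DistinctDemo
{
    public static void Main()
    {
        // Illustrative data: the first two entries are duplicates.
        var source = new List<SkillRequirement>
        {
            new SkillRequirement { Skill = "C#", Requirement = "Advanced" },
            new SkillRequirement { Skill = "C#", Requirement = "Advanced" },
            new SkillRequirement { Skill = "SQL", Requirement = "Basic" },
        };

        var distinctList = source.Distinct(new SkillRequirementComparer()).ToList();

        Console.WriteLine(distinctList.Count); // prints 2
    }
}
```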

If the properties can be null you should avoid a NullReferenceException, e.g.:

int hash = 17;
hash = hash * 23 + (obj.Skill ?? "").GetHashCode();
hash = hash * 23 + (obj.Requirement ?? "").GetHashCode();
return hash;
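
Note that if the properties really can be null, `Equals` needs the same care, since `x.Skill.Equals(...)` would throw as well. Below is a rough sketch of a fully null-tolerant comparer. It assumes a runtime that provides `System.HashCode` (.NET Core 2.1 or later), which is not part of the original answer; `NullSafeSkillRequirementComparer` and `NullSafeDemo` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class SkillRequirement
{
    public string Skill { get; set; }
    public string Requirement { get; set; }
}

// Hypothetical null-tolerant variant of the comparer.
public class NullSafeSkillRequirementComparer : IEqualityComparer<SkillRequirement>
{
    public bool Equals(SkillRequirement x, SkillRequirement y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;

        // The static string.Equals accepts null arguments without throwing.
        return string.Equals(x.Skill, y.Skill)
            && string.Equals(x.Requirement, y.Requirement);
    }

    public int GetHashCode(SkillRequirement obj)
    {
        // HashCode.Combine tolerates null values, so no "?? \"\"" is needed.
        return HashCode.Combine(obj.Skill, obj.Requirement);
    }
}

public static class NullSafeDemo
{
    public static void Main()
    {
        var comparer = new NullSafeSkillRequirementComparer();
        var a = new SkillRequirement { Skill = null, Requirement = "Basic" };
        var b = new SkillRequirement { Skill = null, Requirement = "Basic" };

        Console.WriteLine(comparer.Equals(a, b));                                 // True
        Console.WriteLine(comparer.GetHashCode(a) == comparer.GetHashCode(b));    // True
    }
}
```

On older frameworks, the manual 17/23 version with `?? ""` works just as well for the hash.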
Tim Schmelter
  • (Actually it's originally from Josh Bloch ;) – Matthew Watson May 21 '13 at 10:31
  • @TimSchmelter I guess he's worried about mutating the fields after the object's been placed in a hashing container. – Matthew Watson May 21 '13 at 10:50
  • Wow, thanks! Do you know how it works? I never even encountered the `unchecked` keyword before... :^) – Felix Weir May 21 '13 at 10:54
  • @FelixWeir: `GetHashCode` should be fast and should not produce too many collisions (equal hash codes for unequal objects). Note that `Equals` is called for every object with the same hash code, so `Equals` must return `true` for two objects that are equal according to your definition and `false` otherwise. `GetHashCode` should be implemented as efficiently as possible and be fairly accurate. Because checking for overflow takes time, using unchecked code where overflow does no harm might improve performance. – Tim Schmelter May 21 '13 at 11:32
  • @FelixWeir: Note that it's OK if `GetHashCode` returns the same hash code for two different objects. You should minimize those "false hash codes", but there's nothing wrong with them. That's why `Equals` exists as an additional check. However, two equal objects should always return the same hash code. Read: [Guidelines and rules for GetHashCode](http://blogs.msdn.com/b/ericlippert/archive/2011/02/28/guidelines-and-rules-for-gethashcode.aspx). – Tim Schmelter May 21 '13 at 11:37
  • @TimSchmelter Distinct() implementation seems to use a hashing container. – Ufuk Hacıoğulları May 21 '13 at 12:19
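
As a small illustration of what `unchecked` does in the hash method: the multiplications can exceed the range of `int`, and `unchecked` lets the result wrap around instead of throwing if the project happens to be compiled with overflow checking enabled. This is only a sketch; `UncheckedDemo` is a made-up name.

```csharp
using System;

public static class UncheckedDemo
{
    public static void Main()
    {
        // A non-constant value, so the compiler does not reject the overflow up front.
        int big = int.MaxValue;

        // unchecked: the overflow wraps around silently, which is fine for a hash.
        int wrapped = unchecked(big * 23 + 17);
        Console.WriteLine(wrapped);

        // checked: the same arithmetic throws instead of wrapping.
        try
        {
            int result = checked(big * 23 + 17);
            Console.WriteLine(result);
        }
        catch (OverflowException)
        {
            Console.WriteLine("OverflowException in checked context");
        }
    }
}
```

By default C# evaluates non-constant integer arithmetic unchecked unless the compiler's checked option is turned on, so the explicit block mainly guards against that setting.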

I would also like to link the following Stack Overflow posts, even though the question is already answered.

GetHashCode -

Why is it important to override GetHashCode when Equals method is overridden?

Also, in the answer above, Tim Schmelter says that if the properties can be null, you should avoid a NullReferenceException:

int hash = 17;
hash = hash * 23 + (obj.Skill ?? "").GetHashCode();
hash = hash * 23 + (obj.Requirement ?? "").GetHashCode();
return hash;

IEqualityComparer -

  1. What is the difference between using IEqualityComparer and Equals/GetHashCode Override
  2. What's the role of GetHashCode in the IEqualityComparer in .NET?
  3. How and when to use IEqualityComparer in C#

IEquatable - What's the difference between IEquatable and just overriding Object.Equals()?

Equals - Guidelines for Overloading Equals()

class TwoDPoint : System.Object
{
    public readonly int x, y;

    public TwoDPoint(int x, int y)  //constructor
    {
        this.x = x;
        this.y = y;
    }

    public override bool Equals(System.Object obj)
    {
        // If parameter is null return false.
        if (obj == null)
        {
            return false;
        }

        // If parameter cannot be cast to TwoDPoint return false.
        TwoDPoint p = obj as TwoDPoint;
        if ((System.Object)p == null)
        {
            return false;
        }

        // Return true if the fields match:
        return (x == p.x) && (y == p.y);
    }

    public bool Equals(TwoDPoint p)
    {
        // If parameter is null return false:
        if ((object)p == null)
        {
            return false;
        }

        // Return true if the fields match:
        return (x == p.x) && (y == p.y);
    }

    public override int GetHashCode()
    {
        // XOR the two fields; equal points always produce the same hash.
        return x ^ y;
    }
}
LCJ