131

If I want to use objects as the keys for a Dictionary, what methods will I need to override to make them compare in a specific way?

Say I have a class which has properties:

class Foo {
    public string Name { get; set; }
    public int FooID { get; set; }

    // elided
} 

And I want to create a:

Dictionary<Foo, List<Stuff>>

I want Foo objects with the same FooID to be considered the same group. Which methods will I need to override in the Foo class?

To summarize: I want to categorize Stuff objects into lists, grouped by Foo objects. Stuff objects will have a FooID to link them to their category.

gman
Dana

5 Answers

169

By default, the two important methods are GetHashCode() and Equals(). If two objects are equal (Equals() returns true), they must also return the same hash code. For example, GetHashCode() can simply "return FooID;" if FooID is what you want to match on. You can also implement IEquatable<Foo>, but that is optional:

class Foo : IEquatable<Foo> {
    public string Name { get; set;}
    public int FooID {get; set;}

    public override int GetHashCode() {
        return FooID;
    }
    public override bool Equals(object obj) {
        return Equals(obj as Foo);
    }
    public bool Equals(Foo obj) {
        return obj != null && obj.FooID == this.FooID;
    }
}

Finally, another alternative is to provide an IEqualityComparer<T> to do the same.
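As a minimal sketch of what those overrides buy you (the Stuff class and the GroupingDemo helper are hypothetical names, not from the question), two separate Foo instances with the same FooID should land on the same dictionary entry:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Stuff type; the question only says it carries a FooID.
class Stuff {
    public int FooID { get; set; }
}

class Foo : IEquatable<Foo> {
    public string Name { get; set; }
    public int FooID { get; set; }

    public override int GetHashCode() { return FooID; }
    public override bool Equals(object obj) { return Equals(obj as Foo); }
    public bool Equals(Foo other) { return other != null && other.FooID == FooID; }
}

static class GroupingDemo {
    // Adds to the same list through two different Foo instances
    // and returns how many items that list ends up holding.
    public static int CountForSharedGroup() {
        var groups = new Dictionary<Foo, List<Stuff>>();

        var first = new Foo { Name = "first", FooID = 1 };
        groups[first] = new List<Stuff> { new Stuff { FooID = 1 } };

        // A different instance with the same FooID finds the same entry.
        var second = new Foo { Name = "second", FooID = 1 };
        groups[second].Add(new Stuff { FooID = 1 });

        return groups[first].Count;
    }
}
```

Name plays no part in the lookup here, because neither Equals nor GetHashCode reads it.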

Marc Gravell
  • 5
    +1 and I don't mean to hijack this thread but I was under the impression that GetHashCode() should return FooId.GetHashCode(). Is this not the right pattern? – Ken Browning Mar 11 '09 at 15:48
  • 8
    @Ken - well, it just needs to return an int that provides the necessary features. Which FooID will do just as well as FooID.GetHashCode(). As an implementation detail, Int32.GetHashCode() is "return this;". For other types (string etc), then yes: .GetHashCode() would be very useful. – Marc Gravell Mar 11 '09 at 15:51
  • 2
    Thanks! I went with the IEqualityComparer as it was only for the Dictionary that I needed the overridden methods. – Dana Mar 11 '09 at 16:07
  • 1
    You should be aware that the performance of containers based on hashtables (Dictionary, HashTable, etc.) depends on the quality of the hash function used. If you simply use FooID as the hash code, the containers might perform very poorly. – Jørgen Fogh Sep 05 '14 at 11:08
  • 2
    @JørgenFogh I am very aware of that; the example presented is consistent with the stated intent. There are a lot of concerns related to hash immutability; ids change less often than names, and are *usually* reliably unique and indicators of equivalence. A non-trivial subject, though. – Marc Gravell Sep 05 '14 at 11:11
  • This method makes it possible to create duplicate keys! Create Foo with ID, insert it, change ID & insert it again. Now you have duplicate keys! Is this intended? – Dave_cz Mar 08 '16 at 12:22
  • Almost missed the override, as I wasn't expecting `Equals(obj)`. Btw, it is alright for the hashcode to be the same, as it is pretty difficult to generate unique ids all the time. When the hashcode is the same, it ends up relying on `Equals(obj)` instead. However, there might be performance issues, so keep the hashcode duplication to a minimum. – Cardin Jul 26 '16 at 01:12
  • Shouldn't `FooID` be a `readonly` property, since it's acting as a hash? – user666412 Jun 05 '18 at 22:17
  • @user666412 yes, it should – Marc Gravell Jun 06 '18 at 10:16
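One way to act on the last few comments, sketched here under the assumption that FooID never needs to change after construction (the constructor and the ImmutableKeyDemo helper are illustrative additions, not part of the answer), is to make FooID get-only so the hash code of a key already stored in a dictionary cannot drift:

```csharp
using System;
using System.Collections.Generic;

class Foo : IEquatable<Foo> {
    public Foo(int fooID, string name) { FooID = fooID; Name = name; }

    public int FooID { get; }          // get-only: fixed once the object is built
    public string Name { get; set; }   // not used for equality, so it may change

    public override int GetHashCode() { return FooID; }
    public override bool Equals(object obj) { return Equals(obj as Foo); }
    public bool Equals(Foo other) { return other != null && other.FooID == FooID; }
}

static class ImmutableKeyDemo {
    // Changes a non-key property after insertion and checks the entry is still reachable.
    public static bool StillFound() {
        var key = new Foo(7, "before");
        var dict = new Dictionary<Foo, string> { { key, "value" } };

        key.Name = "after";   // harmless: Name does not feed the hash code

        return dict.ContainsKey(key) && dict.ContainsKey(new Foo(7, "other"));
    }
}
```

With a settable FooID, changing it after insertion would leave the entry filed under the old hash code, which is the duplicate-key scenario described above.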
34

As you want the FooID to be the identifier for the group, you should use that as key in the dictionary instead of the Foo object:

Dictionary<int, List<Stuff>>

If you would use the Foo object as key, you would just implement the GetHashCode and Equals methods to only consider the FooID property. The Name property would just be dead weight as far as the Dictionary was concerned, so you would just use Foo as a wrapper for an int.

Therefore it's better to use the FooID value directly, and then you don't have to implement anything as the Dictionary already supports using an int as a key.
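A rough sketch of that grouping, assuming a Stuff class with a FooID property as described in the question (the Label property and the IntKeyDemo helper are made up for illustration):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Stuff type; only FooID comes from the question.
class Stuff {
    public int FooID { get; set; }
    public string Label { get; set; }
}

static class IntKeyDemo {
    // Buckets each Stuff into the list stored under its FooID.
    public static Dictionary<int, List<Stuff>> GroupByFooId(IEnumerable<Stuff> items) {
        var groups = new Dictionary<int, List<Stuff>>();
        foreach (var item in items) {
            List<Stuff> list;
            if (!groups.TryGetValue(item.FooID, out list)) {
                list = new List<Stuff>();
                groups.Add(item.FooID, list);
            }
            list.Add(item);
        }
        return groups;
    }
}
```

No Equals or GetHashCode override is involved, since int already provides both.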

Edit:
If you want to use the Foo class as key anyway, the IEqualityComparer<Foo> is easy to implement:

public class FooEqualityComparer : IEqualityComparer<Foo> {
   public int GetHashCode(Foo foo) { return foo.FooID.GetHashCode(); }
   public bool Equals(Foo foo1, Foo foo2) { return foo1.FooID == foo2.FooID; }
}

Usage:

Dictionary<Foo, List<Stuff>> dict = new Dictionary<Foo, List<Stuff>>(new FooEqualityComparer());
Guffa
  • 1
    More correctly, int already supports the methods/interfaces required for it to be used as a key. Dictionary has no direct knowledge of int or any other type. – Jim Mischel Mar 11 '09 at 15:58
  • I thought about that, but for a variety of reasons it was cleaner and more convenient to use the objects as the dictionary keys. – Dana Mar 11 '09 at 16:08
  • 1
    Well, it only looks like you are using the object as key, as you are really only using the id as key. – Guffa Mar 11 '09 at 17:21
9

For Foo you will need to override object.GetHashCode() and object.Equals().

The dictionary will call GetHashCode() to pick a hash bucket for each key, and Equals() to check whether two Foos are equal.

Make sure to calculate good hash codes (avoid many unequal Foo objects having the same hash code), but make sure two equal Foos have the same hash code. You might want to start with the Equals method and then, in GetHashCode(), combine the hash codes of every member you compare in Equals; xor is the obvious choice, but multiply-and-add, as in the code here, collides less often.

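public class Foo {
     public string A;
     public string B;

     // Two Foos are equal when both A and B match.
     public override bool Equals(object other) {
          var otherFoo = other as Foo;
          if (otherFoo == null)
             return false;
          return A == otherFoo.A && B == otherFoo.B;
     }

     // Multiply-and-add so that swapping A and B changes the hash in most cases;
     // null fields hash as 0, and unchecked lets the arithmetic wrap on overflow.
     public override int GetHashCode() {
          unchecked {
               return 17 * (A?.GetHashCode() ?? 0) + (B?.GetHashCode() ?? 0);
          }
     }
}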
jball
froh42
  • 2
    Aside - but xor (^) makes a poor combinator for hash-codes, as it often leads to a lot of diagonal collisions (i.e. {"foo","bar"} vs {"bar","foo"}). A better choice is to multiply and add each term - i.e. 17 * A.GetHashCode() + B.GetHashCode(); – Marc Gravell Mar 11 '09 at 15:02
  • 2
    Marc, I see what you mean. But how do you get at the magic number 17? Is it advantageous to use a prime number as a multiplicator for combining hashes? If so, why? – froh42 Mar 11 '09 at 20:59
  • May I suggest returning: (A + B).GetHashCode() rather than: 17 * A.GetHashCode() + B.GetHashCode() This will: 1) Be less likely to have a collision and 2) ensure that there is no integer overflow. – Charles Burns Jul 06 '11 at 22:53
  • (A + B).GetHashCode() makes for a very bad hashing algorithm, as different sets of (A, B) can result in the same hash if they concatenate to the same string; "hellow" + "ned" is the same as "hell" + "owned" and would result in the same hash. – kaesve Jan 23 '14 at 23:54
  • @kaesve how about (A+" "+B).GetHashCode() ? – Timeless May 26 '14 at 05:59
  • try A.GetHashCode() ^ B.GetHashCode() – gillonba Mar 10 '16 at 02:13
  • @rotard when you read above that was my original assumption as well, but it has been suggested 17 * a.getHashCode() + b.getHashCode() is better to prevent diagonal collisions. See other comments. – froh42 Mar 16 '16 at 10:25
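To make the "diagonal collision" point from these comments concrete, here is a small sketch (HashCombineDemo and its method names are hypothetical) of the two combiners side by side:

```csharp
using System;

static class HashCombineDemo {
    // xor is commutative, so (a, b) and (b, a) always produce the same value,
    // and any pair with equal members produces 0.
    public static int Xor(string a, string b) {
        return a.GetHashCode() ^ b.GetHashCode();
    }

    // Multiply-and-add is order-sensitive; unchecked lets the arithmetic wrap on overflow.
    public static int MultiplyAdd(string a, string b) {
        unchecked {
            return 17 * a.GetHashCode() + b.GetHashCode();
        }
    }
}
```

Xor("foo", "bar") and Xor("bar", "foo") are guaranteed to be equal, whereas the two MultiplyAdd results will usually differ. Integer overflow in MultiplyAdd is harmless for hashing as long as it wraps rather than throws.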
1

What about the Hashtable class?

Hashtable oMyDic = new Hashtable();
Object oAnyKeyObject = new Object(); // a Hashtable key must not be null, or Add throws ArgumentNullException
Object oAnyValueObject = null;
oMyDic.Add(oAnyKeyObject, oAnyValueObject);
foreach (DictionaryEntry de in oMyDic)
{
   // Do your job
}

In the above way, you can use any object (an instance of your own class) as a key, although Hashtable is not generic and still calls the key's GetHashCode() and Equals().

Behzad Ebrahimi
1

I had the same problem. I can now use any object I've tried as a key due to overriding Equals and GetHashCode.

Here is a class that I built with methods to use inside of the overrides of Equals(object obj) and GetHashCode(). I decided to use generics and a hashing algorithm that should be able to cover most objects. Please let me know if you see anything here that doesn't work for some types of object and you have a way to improve it.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class Equality<T>
{
    public int GetHashCode(T classInstance)
    {
        List<FieldInfo> fields = GetFields();

        unchecked
        {
            int hash = 17;

            foreach (FieldInfo field in fields)
            {
                // A null field contributes 0 instead of throwing NullReferenceException.
                hash = hash * 397 + (field.GetValue(classInstance)?.GetHashCode() ?? 0);
            }
            return hash;
        }
    }

    public bool Equals(T classInstance, object obj)
    {
        if (ReferenceEquals(null, obj))
        {
            return false;
        }
        if (ReferenceEquals(classInstance, obj))
        {
            return true;
        }
        if (classInstance.GetType() != obj.GetType())
        {
            return false;
        }

        return Equals(classInstance, (T)obj);
    }

    private bool Equals(T classInstance, T otherInstance)
    {
        List<FieldInfo> fields = GetFields();

        foreach (var field in fields)
        {
            // The static object.Equals handles null field values on either side.
            if (!object.Equals(field.GetValue(classInstance), field.GetValue(otherInstance)))
            {
                return false;
            }
        }

        return true;
    }

    private List<FieldInfo> GetFields()
    {
        Type myType = typeof(T);

        List<FieldInfo> fields = myType.GetTypeInfo().DeclaredFields.ToList();
        return fields;
    }
}

Here is how it's used in a class:

public override bool Equals(object obj)
{
    return new Equality<ClassName>().Equals(this, obj);
}

public override int GetHashCode()
{
    unchecked
    {
        return new Equality<ClassName>().GetHashCode(this);
    }
}
Keith Banner