0

By searching though msdn c# documentation and stack overflow, I get the clear impression that Dictionary<T,T> is supposed to use GetHashCode() for checking key-uniqueness and to do look-up.

The Dictionary generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table. ... The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey.

I Use mono (in Unity3D), and after getting some weird results in my work, I conducted this experiment:

public class DictionaryTest
{
    public static void TestKeyUniqueness()
    {
    //Test a dictionary of type1
    Dictionary<KeyType1, string> dictionaryType1 = new Dictionary<KeyType1, string>();
    dictionaryType1[new KeyType1(1)] = "Val1";
    if(dictionaryType1.ContainsKey(new KeyType1(1)))
    {
        Debug.Log ("Key in dicType1 was already present"); //This line does NOT print
    }

    //Test a dictionary of type1
    Dictionary<KeyType2, string> dictionaryType2 = new Dictionary<KeyType2, string>();
    dictionaryType2[new KeyType2(1)] = "Val1";
    if(dictionaryType2.ContainsKey(new KeyType2(1)))
    {
        Debug.Log ("Key in dicType2 was already present"); // Only this line prints
    }
  }
}
//This type implements only GetHashCode()
public class KeyType1
{
    private int var1;

    public KeyType1(int v1)
    {
        var1 = v1;
    }

    public override int GetHashCode ()
    {
        return var1;
    }
}
//This type implements both GetHashCode() and Equals(obj), where  Equals uses the hashcode.
public class KeyType2
{
    private int var1;

    public KeyType2(int v1)
    {
        var1 = v1;
    }

    public override int GetHashCode ()
    {
        return var1;
    }

    public override bool Equals (object obj)
    {
        return GetHashCode() == obj.GetHashCode();
    }
}

Only the when using type KeyType2 are the keys considered equal. To me this demonstrates that Dictionary uses Equals(obj) - and not GetHashCode().

Can someone reproduce this, and help me interpret the meaning is? Is it an incorrect implementation in mono? Or have I misunderstood something.

sune
  • 171
  • 1
  • 9

2 Answers2

2

i get the clear impression that Dictionary is supposed to use .GetHashCode() for checking key-uniqueness

What made you think that? GetHashCode doesn't return unique values.

And MSDN clearly says:

Dictionary requires an equality implementation to determine whether keys are equal. You can specify an implementation of the IEqualityComparer generic interface by using a constructor that accepts a comparer parameter; if you do not specify an implementation, the default generic equality comparer EqualityComparer.Default is used. If type TKey implements the System.IEquatable generic interface, the default equality comparer uses that implementation.

ken2k
  • 48,145
  • 10
  • 116
  • 176
  • MSDN also states: "The Dictionary generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table. " I interpreted this as meaning that hashing is used for uniqueness and lookup. – sune Mar 01 '13 at 11:39
  • @sune It is used for lookup only. `GetHasCode`, as the name says, is only a hash that helps sorting items for quicker lookup. See http://stackoverflow.com/questions/7425142/what-is-hashcode-use-for-is-it-unique for `GetHashCode` usages. – ken2k Mar 01 '13 at 12:24
1

Doing this:

public override bool Equals (object obj)
{
    return GetHashCode() == obj.GetHashCode();
}

is wrong in the general case because you might end up with KeyType2 instances that are equal to StringBuilder, SomeOtherClass, AnythingYouCanImagine and what not instances.

You should totally do it like so:

public override bool Equals (object obj)
{
    if (obj is KeyType2) {
        return (obj as KeyType2).var1 == this.var1;
    } else
        return false;
}

When you are trying to override Equals and inherently GetHashCode you must ensure the following points (given the class MyObject) in this order (you were doing it the other way around):

1) When are 2 instances of MyObject equal ? Say you have:

public class MyObject {
     public string Name { get; set; }
     public string Address { get; set; }
     public int Age { get; set; }

     public DateTime TimeWhenIBroughtThisInstanceFromTheDatabase { get; set; }
}

And you have 1 record in some database that you need to be mapped to an instance of this class. And you make the convention that the time you read the record from the database will be stored in the TimeWhenIBroughtThisInstanceFromTheDatabase:

MyObject obj1 = DbHelper.ReadFromDatabase( ...some params...);
// you do that at 14:05 and thusly the TimeWhenIBroughtThisInstanceFromTheDatabase
// will be assigned accordingly

// later.. at 14:07 you read the same record into a different instance of MyClass
MyObject obj2 = DbHelper.ReadFromDatabase( ...some params...);
// (the same)

// At 14:09 you ask yourself if the 2 instances are the same
bool theyAre  = obj1.Equals(obj2)

Do you want the result to be true ? I would say you do. Therefore the overriding of Equals should like so:

public class MyObject {
    ...
    public override bool Equals(object obj) {
        if (obj is MyObject) {
            var that = obj as MyObject;
            return (this.Name == that.Name) && 
                   (this.Address == that.Address) &&
                   (this.Age == that.Age);
                   // without the syntactically possible but logically challenged:
                   // && (this.TimeWhenIBroughtThisInstanceFromTheDatabase == 
                   //     that.TimeWhenIBroughtThisInstanceFromTheDatabase)
        } else 
            return false;
    }
    ...
}

2) ENSURE THAT whenever 2 instances are equal (as indicated by the Equals method you implement) their GetHashCode results will be identitcal.

int hash1 = obj1.GetHashCode();
int hash2 = obj2.GetHashCode();
bool theseMustBeAlso = hash1 == hash2;

The easiest way to do that is (in the sample scenario):

   public class MyObject {
    ...
    public override int GetHashCode() {
       int result;
       result = ((this.Name != null) ? this.Name.GetHashCode() : 0) ^
                ((this.Address != null) ? this.Address.GetHashCode() : 0) ^
                this.Age.GetHashCode();
       // without the syntactically possible but logically challenged:
       // ^ this.TimeWhenIBroughtThisInstanceFromTheDatabase.GetHashCode()
    }
    ...
} 

Note that: - Strings can be null and that .GetHashCode() might fail with NullReferenceException. - I used ^ (XOR). You can use whatever you want as long as the golden rule (number 2) is respected. - x ^ 0 == x (for whatever x)

Eduard Dumitru
  • 3,242
  • 17
  • 31
  • Thank your for your answer. Clearly you your way of implementinng Equals and GetHashCode is more correct than mine. My code was just meant as an example to test whether Dictionary used GetHashCode(). Which it doesn't seem to. – sune Mar 01 '13 at 11:57