I have built some complex objects and I am trying to verify that they work correctly with unit tests. This involves comparing some List(Of T), so I tried to use CollectionAssert. Now I have encountered something weird.

First I used CollectionAssert.AreEqual to see if the first list was equal. This assertion passed. But for simplicity I wanted to use CollectionAssert.AreEquivalent, so that I do not have to create the expected object in exactly the right order, so I started trying that. With exactly the same code, CollectionAssert.AreEquivalent failed. I would say this is weird, because equivalent is a looser assertion than equal, right? I get this error:

CollectionAssert.AreEquivalent failed. The expected collection contains 1 occurrence(s) of <MyObject>. The actual collection contains 0 occurrence(s).

I tried debugging, but I could not step into the .NET Framework code, even though I had set up downloading of the symbol files. So all I could see is that it enters my custom Equals function once (which returns true) and then the assertion fails. Both lists have two elements. The call stack is (in reverse order):

  • CollectionAssert.AreEquivalent
  • CollectionAssert.AreEquivalent(overload)
  • CollectionAssert.FindMisMatchedElement
  • Generic.Dictionary(Of Object, int).TryGetValue
  • Generic.Dictionary(Of Object, int).FindEntry
  • Generic.ObjectEqualityComparer.Equals
  • My custom Equals

Now that I am writing this, an idea came up and I see a potential problem. I see it internally uses a Dictionary, which probably functions as some sort of hash map where the int is the index in the actual list? Does this mean I need to implement a custom IEqualityComparer rather than overriding Equals? And what should my GetHashCode() look like then? (I'm guessing that this is crucial, since I reckon it might be used for the key in the dictionary.)

Martao
  • Any chance you have overridden `Equals` in your class, but *not* `GetHashCode`? – sloth Apr 06 '13 at 09:24
  • This would be easier to answer if you showed your code. – sloth Apr 06 '13 at 09:26
  • I understand, but in this case it's quite a bit of code. I indeed did not override GetHashCode, as I mention in the last paragraph. But what I don't quite understand yet is what my GetHashCode should look like. – Martao Apr 06 '13 at 09:51
  • The class T basically contains 4 Strings and an Enum type. – Martao Apr 06 '13 at 09:55
  • What I don't quite understand is what it needs GetHashCode for. As I wrote, I reckon it might be used for the key in that dictionary. BUT, hash codes could have collisions, right? Also, there is some logic in my Equals function, which does not directly compare all Strings 1-to-1, so this means I would basically implement this functionality twice in order to ensure equal objects get the same hash code, right? This feels weird... – Martao Apr 06 '13 at 10:07

1 Answer

You're on the right track: the problem is indeed that you don't override GetHashCode while you override Equals.

Here's an example to reproduce your issue:

void Main()
{
    var a = new []{new Broken{Foo="a"}, new Broken{Foo="b"}};
    var b = new []{new Broken{Foo="a"}, new Broken{Foo="b"}};

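    // Passes: elements are compared pairwise, position by position, using Equals.
    CollectionAssert.AreEqual(a, b);
    // Fails: elements are looked up in a Dictionary, which consults GetHashCode first.
    CollectionAssert.AreEquivalent(a, b);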
}

class Broken
{
    public string Foo {get;set;}

    public override bool Equals(object obj)
    {
        return Foo == ((Broken)obj).Foo;
    }
}

As you have correctly noted, CollectionAssert.AreEquivalent uses a Dictionary. It is used to count how often each unique element occurs in each collection; the int is that count, not an index.
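To make the counting idea concrete, here is a minimal sketch of how occurrences could be tallied with a Dictionary. This is not the actual MSTest implementation; `OccurrenceCounter` and `Count` are hypothetical names, and the sketch assumes the collection contains no null elements.

```csharp
using System;
using System.Collections.Generic;

static class OccurrenceCounter
{
    // Hypothetical helper (not MSTest code): counts how often each
    // distinct element occurs. Assumes no null elements.
    public static Dictionary<object, int> Count(IEnumerable<object> items)
    {
        var counts = new Dictionary<object, int>();
        foreach (var item in items)
        {
            // The dictionary finds the key by GetHashCode first and only
            // then calls Equals on candidates with a matching hash code.
            counts.TryGetValue(item, out int seen); // seen is 0 if the key is absent
            counts[item] = seen + 1;
        }
        return counts;
    }
}
```

Two collections would then be equivalent when both produce the same counts for every key, which only works if equal elements are recognized as the same key.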

The problem is not that hash codes can have collisions, but that two elements that should be considered equal are never even compared using Equals if the hash codes returned by GetHashCode differ.
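A minimal sketch of a fix for the example class, assuming equality is based on `Foo` alone. The class name `Fixed` is just for illustration; the point is that GetHashCode is derived from the same field that Equals compares.

```csharp
using System;

class Fixed
{
    public string Foo { get; set; }

    public override bool Equals(object obj)
    {
        // Also handles null and objects of other types without throwing.
        return obj is Fixed other && Foo == other.Foo;
    }

    // Equal objects must return equal hash codes, so hash the same
    // field that Equals compares.
    public override int GetHashCode()
    {
        return Foo == null ? 0 : Foo.GetHashCode();
    }
}
```

With this override, two instances with the same `Foo` are found under the same dictionary key. Since `Foo` has a public setter, it should not be changed while the object is used as a key.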


You may also be interested in this question:

Why is it important to override GetHashCode when Equals method is overridden?


Also, there is some logic in my equals function, which is not directly comparing all Strings 1-to-1, so this means I would basically implement this functionality twice in order to assure equal objects get the same hashcode, right?

Not necessarily. The performance of a Dictionary depends on the hashing algorithm (and the hash value of an object should also not change while it is used as a key).

If you can live with some performance penalties (which are probably negligible), you could use a simpler method to calculate the hash value than the one you use in your Equals method (and accept some more hash collisions). If the hash values of two objects are equal, Equals is called anyway. (In fact, you could get away with just returning the same value, e.g. 1, every time; every lookup would then fall back to comparing with Equals.)
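As an illustration of a hash that is deliberately coarser than Equals, here is a sketch with a made-up class. `Item`, `ItemKind`, and the specific equality rules (case-insensitive name, whitespace-trimmed note) are assumptions for the example, not the asker's actual code.

```csharp
using System;

enum ItemKind { Plain, Special }

class Item
{
    public string Name { get; set; }
    public string Note { get; set; }
    public ItemKind Kind { get; set; }

    // Hypothetical "fuzzy" equality: Name ignores case, Note ignores
    // leading and trailing whitespace.
    public override bool Equals(object obj)
    {
        return obj is Item other
            && Kind == other.Kind
            && string.Equals(Name, other.Name, StringComparison.OrdinalIgnoreCase)
            && string.Equals(Note?.Trim(), other.Note?.Trim(), StringComparison.Ordinal);
    }

    // Coarser than Equals: only Kind is hashed. Equal items always share
    // a Kind, so they always share a hash code; unequal items may collide,
    // and Equals then tells them apart.
    public override int GetHashCode()
    {
        return (int)Kind;
    }
}
```

The fuzzy logic lives only in Equals, so nothing is implemented twice. The trade-off is more collisions, which matters little for the small collections typical of unit tests.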


The relevant section in the documentation:

Object.GetHashCode

Objects used as a key in a Hashtable object must also override the GetHashCode method because those objects must generate their own hash code

If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two object do not have to return different values.

sloth
  • Okay, this is indeed similar to what I thought. However, what I don't quite understand is how you could use the hash code as the key for a dictionary in the face of hash collisions. Doesn't this mean I could have two entries for the same key? (however slim that chance is) – Martao Apr 06 '13 at 11:27
  • Never mind, I just saw your point about the usage of the Dictionary. Thanks, that clears it up for real. – Martao Apr 06 '13 at 11:59