9

I find my self overriding Equals() and GetHashCode() frequently to implement the semantic that business objects with identical property values are equal. That leads to code that is repetitive to write and fragile to maintain (property gets added and one/both of the overrides are not updated).

The code ends up looking something like this (comments on the implementation are welcome):

public override bool Equals(object obj)
{
    if (object.ReferenceEquals(this, obj)) return true;

    MyDerived other = obj as MyDerived;

    if (other == null) return false;

    bool baseEquals = base.Equals((MyBase)other);
    return (baseEquals && 
        this.MyIntProp == other.MyIntProp && 
        this.MyStringProp == other.MyStringProp && 
        this.MyCollectionProp.IsEquivalentTo(other.MyCollectionProp) && // See http://stackoverflow.com/a/9658866/141172
        this.MyContainedClass.Equals(other.MyContainedClass));
}

public override int GetHashCode()
{
    int hashOfMyCollectionProp = 0;
    // http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/
    // BUT... is it worth the extra math given that elem.GetHashCode() should be well-distributed?
    int bitSpreader = 31; 
    foreach (var elem in MyCollectionProp)
    {
        hashOfMyCollectionProp = spreader * elem.GetHashCode();
        bitSpreader *= 31;
    }
    return base.GetHashCode() ^ // ^ is a good combiner IF the combined values are well distributed
        MyIntProp.GetHashCode() ^ 
        (MyStringProp == null ? 0 : MyStringProp.GetHashValue()) ^
        (MyContainedClass == null ? 0 : MyContainedClass.GetHashValue()) ^
        hashOfMyCollectionProp;
}

My Questions

  1. Is the implementation pattern sound?
  2. Is ^ adequate given that the contributing component values are well-distributed? Do I need to multiply by 31-to-the-N when combining collection elements given their hash is well distributed?
  3. It seems this code could be abstracted into code that uses reflection to determine public properties, builds an expression tree that matches the hand-coded solution, and executes the expression tree as needed. Does that approach seem reasonable? Is there an existing implementation somewhere?
Eric J.
  • 147,927
  • 63
  • 340
  • 553
  • 2
    Why the downvote (over a year after posting)? The question is very legitimate. If there is something wrong with it, please say what. – Eric J. May 30 '13 at 17:38

3 Answers3

4

MSDN actually doesn't say "don't overload Equals et al for mutable types". It used to say that, but now it says:

When you define a class or struct, you decide whether it makes sense to create a custom definition of value equality (or equivalence) for the type. Typically, you implement value equality when objects of the type are expected to be added to a collection of some sort, or when their primary purpose is to store a set of fields or properties.

http://msdn.microsoft.com/en-us/library/dd183755.aspx

Still, there are complexities surrounding stability of the hash code while an object participates in a hashed collection (Dictionary<T,U>, HashSet<T>, etc.).

I decided to opt for the best of both worlds, as outlined here:

https://stackoverflow.com/a/9752155/141172

Community
  • 1
  • 1
Eric J.
  • 147,927
  • 63
  • 340
  • 553
1

I find my self overriding Equals() and GetHashCode() frequently

  • MSDN says : don't overload Equals et al for mutable types

Is ^ adequate given that the contributing component values are well-distributed?

  • Yes, but hey are not always well distributed. Consider int properties. Shifting with some (small) prime numbers is advised.
H H
  • 263,252
  • 30
  • 330
  • 514
  • Where does MSDN say that? I tried googling the text literally and only found this question. – Eric J. Mar 14 '12 at 18:53
  • Also... to take a simple example of where I see utility in overriding Equals (but maybe I'm missing something), how would you implement a unit test that does not override Equals given `MyDerived expected = new MyDerived() { /* Initialization */ }; MyDerived actual = DoSomeTest(); Assert.AreEqual(expected, actual)`? – Eric J. Mar 14 '12 at 18:57
  • And regarding ^... in my case, each XORed value is a result of GetHashCode(), which should be reasonably distributed. As implemented, do you see flaws in how I use ^? Would you multiply collection elements by a prime even if each collection element's hash is a result of GetHashCode()? – Eric J. Mar 14 '12 at 18:59
  • It used to be spelled out on [this page](http://msdn.microsoft.com/en-us/library/7h9bszxx%28v=vs.100%29.aspx) but now only in the footnote (someone quoted the old text). You can still distill it from the various advices about value/reference types. – H H Mar 14 '12 at 19:07
  • About the `^` : How many `int` properties do you have? They are their own HashCode and tend to be _not_ well distributed. Search this site. – H H Mar 14 '12 at 19:09
  • Ah... didn't realize until you pointed it out that an int's hash code is just the int itself. I don't see why `Overriding operator == in non-immutable types is not recommended` is universally true (maybe that's why MSDN removed it from the official documentation?) That sounds more like a separate question, though, so I'll pose it as one shortly. – Eric J. Mar 14 '12 at 19:29
  • The better question: Do your objects have _identity_. If so then reference-equals is the perfect implementation. – H H Mar 14 '12 at 19:42
  • Just came across this... seems MSDN has changed guidance http://msdn.microsoft.com/en-us/library/dd183755.aspx – Eric J. Mar 14 '12 at 20:06
  • @HenkHolterman: It's too bad Microsoft never defined a standard convention for mutable objects to allow comparisons of their *state*. IMHO, `Equals` overloads should define a strict equivalence, such that visibly mutable objects are never equivalent to anything but themselves, but objects which hold references to instances which are of mutable types but *will never be mutated* may be equivalent if they hold references to objects with matching state. – supercat May 15 '13 at 22:31
0

Perhaps I'm confused here, but shouldn't the on null check return a 1 instead of a 0 in the GetHashCode override?

So

MyStringProp == null ? 0 : MyStringProp.GetHashValue()

should be

MyStringProp == null ? 1 : MyStringProp.GetHashValue()
Serj Sagan
  • 28,927
  • 17
  • 154
  • 183