
I've got a complex class in my C# project on which I want to be able to do equality tests. It is not a trivial class; it contains a variety of scalar properties as well as references to other objects and collections (e.g. IDictionary). For what it's worth, my class is sealed.

To enable a performance optimization elsewhere in my system (an optimization that avoids a costly network round-trip), I need to be able to compare instances of these objects to each other for equality – other than the built-in reference equality – and so I'm overriding the Object.Equals() instance method. However, now that I've done that, Visual Studio 2008's Code Analysis a.k.a. FxCop, which I keep enabled by default, is raising the following warning:

warning : CA2218 : Microsoft.Usage : Since 'MySuperDuperClass' redefines Equals, it should also redefine GetHashCode.

I think I understand the rationale for this warning: if I am going to use such objects as keys in a collection, the hash code is important (see, for example, this question). However, I am not going to be using these objects as the key in a collection. Ever.

Feeling justified to suppress the warning, I looked up code CA2218 in the MSDN documentation to get the full name of the warning so I could apply a SuppressMessage attribute to my class as follows:

    [SuppressMessage("Microsoft.Usage",
        "CA2218:OverrideGetHashCodeOnOverridingEquals",
        Justification="This class is not to be used as key in a hashtable.")]

However, while reading further, I noticed the following:

How to Fix Violations

To fix a violation of this rule, provide an implementation of GetHashCode. For a pair of objects of the same type, you must ensure that the implementation returns the same value if your implementation of Equals returns true for the pair.

When to Suppress Warnings

-----> Do not suppress a warning from this rule. [arrow & emphasis mine]

So, I'd like to know: Why shouldn't I suppress this warning as I was planning to? Doesn't my case warrant suppression? I don't want to code up an implementation of GetHashCode() for this object that will never get called, since my object will never be the key in a collection. If I wanted to be pedantic, instead of suppressing, would it be more reasonable for me to override GetHashCode() with an implementation that throws a NotImplementedException?


Update: I just looked this subject up again in Bill Wagner's good book Effective C#, and he states in "Item 10: Understand the Pitfalls of GetHashCode()":

If you're defining a type that won't ever be used as the key in a container, this won't matter. Types that represent window controls, web page controls, or database connections are unlikely to be used as keys in a collection. In those cases, do nothing. All reference types will have a hash code that is correct, even if it is very inefficient. [...] In most types that you create, the best approach is to avoid the existence of GetHashCode() entirely.

... that's where I originally got this idea that I need not be concerned about GetHashCode() always.

Cœur
Chris W. Rea

5 Answers


If you are reallio-trulio absosmurfly positive that you'll never use the thing as a key to a hash table then your proposal is reasonable. Override GetHashCode; make it throw an exception.

Note that hash tables hide in unlikely places. Plenty of LINQ sequence operators use hash table implementations internally to speed things up. By rejecting the implementation of GetHashCode you are also rejecting being able to use your type in a variety of LINQ queries. I like to build algorithms that use memoization for speed increases; memoizers usually use hash tables. You are therefore also rejecting the ability to memoize method calls that take your type as a parameter.
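To make the LINQ point concrete, here is a minimal sketch. The class `ThrowingAssumptions` and its `Rate` property are hypothetical stand-ins for the asker's type, and the sketch assumes the LINQ to Objects implementation of `Distinct()`, which hashes its elements internally:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: value equality, but GetHashCode deliberately throws.
public sealed class ThrowingAssumptions
{
    public int Rate { get; private set; }

    public ThrowingAssumptions(int rate) { Rate = rate; }

    public override bool Equals(object obj)
    {
        var other = obj as ThrowingAssumptions;
        return other != null && other.Rate == Rate;
    }

    public override int GetHashCode()
    {
        throw new NotImplementedException("Not usable as a hash key.");
    }
}

public static class LinqHashDemo
{
    // Returns true if Distinct() ended up calling GetHashCode on the elements.
    public static bool DistinctCallsGetHashCode()
    {
        var items = new List<ThrowingAssumptions>
        {
            new ThrowingAssumptions(1),
            new ThrowingAssumptions(1)
        };

        try
        {
            // No dictionary in sight, yet Distinct() hashes every element.
            items.Distinct().ToList();
            return false;
        }
        catch (NotImplementedException)
        {
            return true;
        }
    }
}
```

Nothing in the calling code mentions a hash table, which is what makes this failure easy to hit by accident.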

Alternatively, if you don't want to be that harsh: Override GetHashCode; make it always return zero. That meets the semantic requirement of GetHashCode: two equal objects must always have the same hash code. If the object is ever used as a key in a dictionary, performance is going to be terrible, but you can deal with that problem when it arises, which you claim it never will.
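A sketch of the "always return zero" option, assuming a hypothetical class `ConstantHashScenario` with a single `Name` property in place of the real type:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type illustrating a legal but degenerate GetHashCode.
public sealed class ConstantHashScenario
{
    public string Name { get; private set; }

    public ConstantHashScenario(string name) { Name = name; }

    public override bool Equals(object obj)
    {
        var other = obj as ConstantHashScenario;
        return other != null && other.Name == Name;
    }

    // Equal objects trivially share a hash code, so the contract holds.
    // Every instance lands in the same bucket, so hashed lookups
    // degrade to a linear scan that falls back on Equals.
    public override int GetHashCode()
    {
        return 0;
    }
}
```

A `Dictionary<ConstantHashScenario, TValue>` still returns correct results with this type; it is only slow once it holds many keys.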

All that said: come on. You've probably spent more time typing up the question than it would take to correctly implement it. Just do it.

Eric Lippert
  • Good answer. And re: *"You've probably spent more time typing up the question than it would take to correctly implement it."* ... you're right *in this case* but I actually have a *series* of N objects that need something like Equals() to support my optimization. Trying to avoid N* the work. ;-) – Chris W. Rea Mar 25 '10 at 15:49
  • And thanks especially for pointing out that LINQ may imply usage of GetHashCode(). That is particularly enlightening. – Chris W. Rea Mar 25 '10 at 15:51

You should not suppress it. Look at how your equals method is implemented. I'm sure it compares one or more members on the class to determine equality. One of these members is oftentimes enough to distinguish one object from another, and therefore you could implement GetHashCode by returning membername.GetHashCode();.

Klaus Byskov Pedersen
  • It actually compares almost all of the members of the class to determine equality. The class contains assumptions for a series of calculations. Any one of them being different is sufficient for me to consider the objects different. Some of the assumptions are arrays or collections of other numbers. So GetHashCode would be necessarily as complex if it were implemented (correctly). – Chris W. Rea Mar 25 '10 at 14:50
  • @Chris: It doesn't have to be. A simple but often acceptable implementation is to XOR the GetHashCode values of the members. – Steven Sudit Mar 25 '10 at 15:16
  • An even simpler and acceptable (but not usually optimal) implementation is to return the GetHashCode() value of only one field. That's what happens for a struct. – Hans Passant Mar 25 '10 at 15:30
  • @klasbyskov, @Steven Sudit, @nobugz: Yes, I just realized my naivety: GetHashCode() doesn't *necessarily* need to factor in *all* of the members referred to by Equals() -- just enough of them to be "efficient" as a hash code. This was made obvious to me when Eric Lippert mentioned the simplest/worst case: *"if you don't want to be that harsh: Override GetHashCode; make it always return zero."* – Chris W. Rea Mar 25 '10 at 16:01
  • I'm accepting this as the answer. The assist goes to Eric Lippert. – Chris W. Rea Mar 25 '10 at 16:04
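The idea reached in these comments, that GetHashCode only has to agree with Equals and need not inspect every member, could be sketched as follows. The class `CalculationAssumptions` and its members are hypothetical, and the 17/31 multiply-and-add combiner is one common convention rather than anything prescribed in this thread:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: Equals compares everything, GetHashCode only a cheap subset.
public sealed class CalculationAssumptions
{
    public int Years { get; private set; }
    public string Currency { get; private set; }
    public IList<double> Factors { get; private set; }

    public CalculationAssumptions(int years, string currency, IList<double> factors)
    {
        if (currency == null) throw new ArgumentNullException("currency");
        if (factors == null) throw new ArgumentNullException("factors");
        Years = years;
        Currency = currency;
        Factors = factors;
    }

    public override bool Equals(object obj)
    {
        var other = obj as CalculationAssumptions;
        return other != null
            && Years == other.Years
            && Currency == other.Currency
            && Factors.SequenceEqual(other.Factors);
    }

    public override int GetHashCode()
    {
        // unchecked: integer overflow simply wraps, which is fine for a hash.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Years;
            hash = hash * 31 + Currency.GetHashCode();
            // Only the element count is hashed, not the elements themselves.
            // Equal sequences always have equal counts, so equal objects
            // still produce equal hash codes.
            hash = hash * 31 + Factors.Count;
            return hash;
        }
    }
}
```

Because the hash depends on mutable state (the list's count), an instance should not be modified while it is stored in a hashed collection.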

My $0.10 worth? Implement GetHashCode.

As much as you say you'll never, ever need it, you may change your mind, or someone else may have other ideas on how to use the code. A working GetHashCode isn't hard to make, and guarantees that there won't be any problems in the future.

Steven Sudit
  • If I document my class with *"don't use this as a key in a collection"*, would it be fair game for me to throw a *NotImplementedException* for my class's GetHashCode()? What are the problems with that approach? – Chris W. Rea Mar 25 '10 at 14:59
  • You could, but what's the real benefit? A class with a special limitation that throws exceptions when tossed into a Dictionary is a liability. Implementing GetHashCode makes it fit right in with our reasonable expectations. – Steven Sudit Mar 25 '10 at 15:15

As soon as you forget, or another developer who isn't aware uses this, someone is going to have a painful bug to track down. I'd recommend simply implementing GetHashCode correctly and then you won't have to worry about it. Or just don't use Equals for your special equality comparison case.

ermau
  • But what about the idea in my last paragraph: implementing GetHashCode() to throw a NotImplementedException? It would be more difficult for a developer to mistakenly use it as a key - it would blow up (by design). – Chris W. Rea Mar 25 '10 at 14:54
  • While it may alleviate that mistake, it would be better to define the semantics now rather than later. – ermau Mar 25 '10 at 15:04

The GetHashCode and Equals methods work together to provide value-based equality semantics for your type - you ought to implement them together.

For more information on this topic please see these articles:

Shameless plug: These articles were written by me.

Andrew Hare
  • Does GetHashCode() ever get called by the framework, excluding when an object is used as a key in a collection? – Chris W. Rea Mar 25 '10 at 14:57
  • I am not sure, but the important thing to remember is that at any point in the future it _could_ be called, and since the current recommendation is that you implement the method, it would be better to do so. – Andrew Hare Mar 25 '10 at 15:03