0

I have a class Person for which I have to override the Equals and GetHashCode methods. Two Person objects are equal if the Name matches OR if the Email matches. What's a good way of doing this with a reasonably efficient hash function?

class Person
{
    string Name;
    string Email;

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(obj, null))
            return false;
        if (ReferenceEquals(this, obj))
            return true;
        if (obj is Person)
        {
            Person person = (Person)obj;
            // Equal if either the name or the email matches.
            return
                (this.Name == person.Name)
                || (this.Email == person.Email);
        }
        return false;
    }

    public override int GetHashCode()
    {
        // What's a good way to implement?
    }
}
Sheena
hIpPy
  • possible duplicate of [What is the best algorithm for an overridden System.Object.GetHashCode?](http://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode) – Gonzalo Nov 08 '10 at 22:13
  • @Gonzalo: It's absolutely *not* a duplicate of that. It's a very different situation. – Jon Skeet Nov 08 '10 at 22:17

4 Answers

10

You can't, really. Well, not apart from returning a constant value.

Look at it this way... all people with email "x" have to have the same hash code, because they're equal. And all people with name "y" have to have the same hash code, and so it goes on:

Name    Email    Hash
  n1       e1      h1
  n2       e1      h1 (because emails are equal)
  n2       e2      h1 (because names are equal to previous)

Note how we've managed to change both the name and the email to arbitrary values, but the hash has to still be h1.
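
As a minimal sketch of the constant-value fallback, assuming the question's `Person` class plus a hypothetical two-argument constructor added only for illustration: the hash below satisfies the rule that equal objects share a hash code, but every instance lands in the same bucket, so hash-based collections degrade to linear scans.

```csharp
using System;

class Person
{
    public string Name { get; }
    public string Email { get; }

    // Hypothetical constructor, added for illustration only.
    public Person(string name, string email)
    {
        Name = name;
        Email = email;
    }

    public override bool Equals(object obj)
    {
        // Same rule as in the question: equal if Name OR Email matches.
        return obj is Person other
            && (Name == other.Name || Email == other.Email);
    }

    // Every instance returns the same value, so equal objects
    // trivially share a hash code.
    public override int GetHashCode() => 0;
}

static class Demo
{
    static void Main()
    {
        var first = new Person("n1", "e1");
        var second = new Person("n2", "e1");  // equal to first via Email
        var third = new Person("n2", "e2");   // equal to second via Name

        Console.WriteLine(first.Equals(second));                        // True
        Console.WriteLine(second.Equals(third));                        // True
        Console.WriteLine(first.GetHashCode() == third.GetHashCode());  // True
    }
}
```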

Jon Skeet
8

I know that this does not answer your question, but your approach is incorrect. It is expected that if a == b, and b == c, it necessarily follows that a == c.

Person a:
    name: mike
    email: someone@website.com

Person b:
    name: steve
    email: someone@website.com

Person c:
    name: steve
    email: steve@website.com

In this example a == b, and b == c, but a != c. This is incorrect behavior: equality is expected to be transitive. If you want this comparison, it is perfectly fine to put it in a method other than Equals, but not in Equals itself.

See http://msdn.microsoft.com/en-us/library/ms173147%28VS.80%29.aspx.
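
The three people above can be checked directly. This is a sketch only, assuming the question's OR-based `Equals` and a hypothetical constructor; the constant `GetHashCode` is there just to keep the override pair complete.

```csharp
using System;

class Person
{
    public string Name { get; }
    public string Email { get; }

    // Hypothetical constructor, added for illustration only.
    public Person(string name, string email)
    {
        Name = name;
        Email = email;
    }

    // The question's rule: equal if Name OR Email matches.
    public override bool Equals(object obj)
    {
        return obj is Person other
            && (Name == other.Name || Email == other.Email);
    }

    // Constant hash so that equal objects never disagree on hash code.
    public override int GetHashCode() => 0;
}

static class Demo
{
    static void Main()
    {
        var a = new Person("mike", "someone@website.com");
        var b = new Person("steve", "someone@website.com");
        var c = new Person("steve", "steve@website.com");

        Console.WriteLine(a.Equals(b));  // True  (emails match)
        Console.WriteLine(b.Equals(c));  // True  (names match)
        Console.WriteLine(a.Equals(c));  // False (nothing matches)
    }
}
```

Because `a` equals `b` and `b` equals `c` while `a` does not equal `c`, code that relies on `Equals` being transitive can behave inconsistently with this class.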

snarf
0

Like Alex said, this is more of a business-rule concern, and I wouldn't use Equals for this purpose. I'd have another method that contains the implementation you currently have in the Equals method.

Of course, Alex mentions a hash of Name+Email, but that won't work for you either since, as Jon pointed out, it's not really something you can do given the business rules you have.
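
A rough sketch of that separate-method approach follows. The method name `MatchesNameOrEmail` and the constructor are hypothetical, chosen only for illustration; `Equals` and `GetHashCode` are left at their defaults, so hash-based collections keep working with reference semantics.

```csharp
using System;

class Person
{
    public string Name { get; }
    public string Email { get; }

    // Hypothetical constructor, added for illustration only.
    public Person(string name, string email)
    {
        Name = name;
        Email = email;
    }

    // Hypothetical method: the OR rule lives here instead of in Equals,
    // so Equals and GetHashCode keep their default reference semantics.
    public bool MatchesNameOrEmail(Person other)
    {
        return other != null
            && (Name == other.Name || Email == other.Email);
    }
}

static class Demo
{
    static void Main()
    {
        var a = new Person("mike", "someone@website.com");
        var b = new Person("steve", "someone@website.com");

        Console.WriteLine(a.MatchesNameOrEmail(b));  // True:  emails match
        Console.WriteLine(a.Equals(b));              // False: different instances
    }
}
```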

Jackie Kirby
-4

There is a way in which you can do what you're trying to do.

Let's say you have an Enum that you've defined like so

public enum MatchedOn { None, Name, Email }

Next, pull out the implementation of your Equals method into another method such that you call it from your Equals method. In this new method, set the enum to be Name if the names are equal or Email if the emails are equal or None if neither is the same.

Then in your GetHashCode implementation you can call this new method as well and then return a hashed code based on Name or Email or the combination of both.

I hope that makes sense.

Shiv Kumar
  • -1: This does not make sense, and will not work in [Jon Skeet's example](http://stackoverflow.com/questions/4128584/gethashcode-equals-implmentation-for-a-class-in-c/4128616#4128616) unless `GetHashCode` is allowed to vary between calls, which it must not. – Brian Nov 09 '10 at 18:51
  • @Brian, There are many rules to implementing GetHashCode(). If you and the OP really want to know, here they are: 1. If two objects are equal as defined by the operator== they MUST generate the same hash code. 2. GetHashCode() must be instance invariant. 3. The hash function should generate a random distribution across the range of integers. – Shiv Kumar Nov 09 '10 at 22:48
  • Now you don't have to (and should not) implement GetHashcode unless you intend to use your type as a key in a Dictionary or a HashSet. But seeing that the OP didn't mention this need nor care about the other rules, I don't see why you're hung up on "must not change" when there is no mention of his customer type being immutable. – Shiv Kumar Nov 09 '10 at 22:48
  • @Shiv: My explanation was a bit imprecise, as are the rules you are listing. My point about "must not change" was that the hash code should not change unless the return value of the `Equals` function would change, which **is** required ( [See MSDN](http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx) ). Your scheme will break if you don't allow calling the `Equals` method to change the hash code. – Brian Nov 09 '10 at 23:02
  • Though maybe I'm misunderstanding your proposal. Maybe you're proposing calling the `Equals` method on every pair of `Person` instances in order to generate the hash code. That's still completely broken, though. – Brian Nov 09 '10 at 23:04
  • @Brian, let's face it, the OP use of or need to implement GetHashCode is not very clear so anything anyone suggests as a solution is full of assumptions. The rules I've listed are not imprecise (I don't think). The hash code *should not* change after an instance has been constructed. Any other behavior is wrong. However, given the OP's needs one has to break this rule and others. So from that point anything goes, right? You can't say some rules still apply and others don't. So I was trying to go with the flow and provide a possible solution. It may not be correct. – Shiv Kumar Nov 10 '10 at 03:13
  • FYI: There is no connection between Equals and GetHashCode per-se. Equality is determined using operator==. But if the OP has no intent on using his objects as a key in a Dictionary or HashSet there is no need to implement GetHashCode anyway and in fact he should not. Bill Wagner talks about this in his book Effective C# if I remember correctly or is it Jeff Richter? – Shiv Kumar Nov 10 '10 at 03:22
  • @Shiv: You are incorrect; there is a connection to the equals method. From the MSDN article: "The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's **Equals** method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again." – Brian Nov 10 '10 at 06:21
  • As I said, there is no connection per-se :). Brian you have to understand when each is used and also that equality is not determined by the Equals method but rather the operator==. When I said no connection per se what I mean is that GetHashCode does not call Equals. When adding items to a dictionary only GetHashCode is called, Equals is not. The documentation is talking about the whole kit-n-kaboodle which implies correct implementations. This particular use of Equals (by the OP) is not the "correct" use as I think we've established. We're just going round and round so I'll beg off from this. – Shiv Kumar Nov 10 '10 at 06:29