
There is an excellent question and answer on this topic: Do I HAVE to override GetHashCode and Equals in new Classes?

As it mentions:

you only need to override them if you need value equality semantics. The System.Object implementation isn't 'bad', it just only does a reference check (which is all an implementation at that level can do).

In short: If you need some sort of value based equality (equality based on properties of the class), then yes, override away. Otherwise, it should be more than fine already.

Let's suppose I have a class User:

public class User: IEquatable<User>
{
    private readonly string _firstName;
    private readonly string _lastName;
    private readonly string _address;

    public User (string firstName, string lastName, string address)
    {       
        this._firstName = firstName;
        this._lastName = lastName;
        this._address = address;
    }

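    // Read-only properties backed by the fields set in the constructor.
    public string FirstName { get { return _firstName; } }
    public string LastName { get { return _lastName; } }
    public string Address { get { return _address; } }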


    //should I need to override this?
    public override bool Equals(object right)
    {
        if (object.ReferenceEquals(right, null))
            return false;

        if (object.ReferenceEquals(this, right))
            return true;

        if (this.GetType() != right.GetType())
            return false;

        return this.Equals(right as User);
    }

    #region IEquatable<User> Members
    public bool Equals(User user)
    {
        // A null argument is never equal to this instance.
        if (object.ReferenceEquals(user, null))
            return false;

        // All three fields must match; the static string.Equals overload treats two nulls as equal.
        return string.Equals(this._firstName, user._firstName, StringComparison.InvariantCultureIgnoreCase) &&
               string.Equals(this._lastName, user._lastName, StringComparison.InvariantCultureIgnoreCase) &&
               string.Equals(this._address, user._address, StringComparison.InvariantCultureIgnoreCase);
    }
    #endregion

}

User user1 = new User("John", "Wayne", "Collins Avenue");
User user2 = new User("John", "Wayne", "Collins Avenue");

//if I don't override these methods, reference equality applies:
user1 == user2 // false

//if I override GetHashCode and Equals methods, then:
user1.Equals(user2) // true (user1 == user2 stays false unless operator == is also overloaded)

IList<User> usersList1 = new List<User>();
IList<User> usersList2 = new List<User>();

usersList1.Add(user1);
usersList2.Add(user2);

IList<User> finalUsersList = usersList1.Union(usersList2).ToList(); // Union returns IEnumerable<User>

//if I don't override these methods, count will be:
finalUsersList.Count() // 2
//if I do override these methods, count will be:
finalUsersList.Count() // 1 

Is it right?

  1. Is the first Equals override (the one marked with a comment) required?
  2. In this case, which class members should I include in the GetHashCode override? All the members involved in the Equals method?

    public override int GetHashCode()
    {
        unchecked
        {
            // Hash -> primes
            int hash = 17;
    
            hash = hash * 23 + FirstName.GetHashCode();
            hash = hash * 23 + LastName.GetHashCode();
            hash = hash * 23 + Address.GetHashCode();
            return hash;
        }
    }
    

What happens if I only use FirstName for example?

Alberto Montellano
  • well, when are the persons equal to you? – default Feb 09 '15 at 16:33
  • if firstname, lastname and address are equal. – Alberto Montellano Feb 09 '15 at 16:34
  • @AlbertoMontellano Then that's your answer. – Servy Feb 09 '15 at 16:35
  • You could define `GetHashCode()` to return a constant and it would still work -- but it wouldn't be very efficient. The hash is used to quickly establish whether two objects are *potentially* the same without running the full `Equals` method to find out. The more unique your hash is for the set of attributes you consider "equal", the better. For example, if you put your objects in a dictionary, it's the hash that determines how they get spread out in internal buckets. If most of your (non-equal) objects return a unique hash code, there'll be a low collision rate, and a fast lookup. – Cameron Feb 09 '15 at 16:36
  • @Cameron you're right with your answer. – Alberto Montellano Feb 09 '15 at 16:47

2 Answers


Is the first Equals override (the one marked with a comment) required?

Some comparisons use the generic version, Equals(User), and some use the non-generic version, Equals(object). Since the non-generic version is trivial to implement once you have the generic one, there's no harm in implementing it.
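
To illustrate which callers pick which overload, here is a minimal sketch. The type NameOnlyGeneric and the helper class EqualsDemo are hypothetical names made up for this example; the type deliberately implements only the generic Equals so the difference becomes visible.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type: implements IEquatable<T> but does NOT override Equals(object).
public sealed class NameOnlyGeneric : IEquatable<NameOnlyGeneric>
{
    public string Value { get; }

    public NameOnlyGeneric(string value) { Value = value; }

    // Generic, strongly typed comparison.
    public bool Equals(NameOnlyGeneric other)
    {
        return other != null && Value == other.Value;
    }
}

public static class EqualsDemo
{
    // List<T>.Contains goes through EqualityComparer<T>.Default,
    // which calls IEquatable<T>.Equals when T implements it.
    public static bool GenericListContains()
    {
        var list = new List<NameOnlyGeneric> { new NameOnlyGeneric("John") };
        return list.Contains(new NameOnlyGeneric("John"));
    }

    // ArrayList.Contains only knows about object, so it calls Equals(object),
    // which here is still the inherited reference comparison.
    public static bool NonGenericListContains()
    {
        var list = new ArrayList { new NameOnlyGeneric("John") };
        return list.Contains(new NameOnlyGeneric("John"));
    }
}
```

In this sketch GenericListContains() finds the item while NonGenericListContains() does not, which is why overriding Equals(object) alongside the generic version keeps both kinds of callers consistent.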

In this case, which class members should I include in the GetHashCode override? All the members involved in the Equals method?

The only requirement for GetHashCode is that two objects that are "equal" must return the same hash code (the reverse is not true: two equal hash codes do not imply equal objects).

So your GetHashCode can do anything from returning a constant (punt and use Equals to determine equality) to an elaborate function that returns as many distinct hash codes as possible.

For reasonable performance when using hash-based collections, you should design your GetHashCode logic to minimize collisions. This is typically done by iteratively multiplying the hash code by a prime number as you are doing.
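
As a rough illustration of the "constant hash code" extreme, the sketch below uses a hypothetical ConstantHashUser class (not the User class from the question) whose GetHashCode always returns the same value. A HashSet still deduplicates correctly because Equals makes the final decision; only the performance suffers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration only.
public sealed class ConstantHashUser : IEquatable<ConstantHashUser>
{
    private readonly string _firstName;
    private readonly string _lastName;

    public ConstantHashUser(string firstName, string lastName)
    {
        _firstName = firstName;
        _lastName = lastName;
    }

    public bool Equals(ConstantHashUser other)
    {
        return other != null
            && string.Equals(_firstName, other._firstName, StringComparison.Ordinal)
            && string.Equals(_lastName, other._lastName, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ConstantHashUser);
    }

    // Satisfies the contract (equal objects share a hash code) but puts
    // every instance in the same bucket, so lookups degrade to a linear scan.
    public override int GetHashCode()
    {
        return 1;
    }
}

public static class ConstantHashDemo
{
    public static int DistinctCount()
    {
        var set = new HashSet<ConstantHashUser>
        {
            new ConstantHashUser("John", "Wayne"),
            new ConstantHashUser("John", "Wayne"), // equal to the first, so not added
            new ConstantHashUser("Jane", "Doe"),
        };
        return set.Count; // 2: correctness is preserved despite the constant hash
    }
}
```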

The other key is that an object's hash code must not change while the object is stored in a hash-based collection, which means that the values the hash code is derived from must not change either. If a hash code changed over the life of an object, it would be impossible to find the item in a dictionary, since the dictionary stores items based on their hash value.
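
The following sketch shows the failure mode with a hypothetical MutableKey class whose hash code depends on a settable property. The names are invented for the example, and the "not found" outcome assumes the two names hash differently, which is the overwhelmingly likely case.

```csharp
using System.Collections.Generic;

// Hypothetical type whose hash code depends on mutable state.
public sealed class MutableKey
{
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && Name == other.Name;
    }

    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class MutableKeyDemo
{
    public static bool FoundBeforeMutation()
    {
        var key = new MutableKey { Name = "John" };
        var set = new HashSet<MutableKey> { key };
        return set.Contains(key);
    }

    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Name = "John" };
        var set = new HashSet<MutableKey> { key };

        // The set recorded the hash of "John" when the key was added.
        // Changing Name changes the hash the lookup computes.
        key.Name = "Jane";

        // The very same object is still in the set, but the lookup
        // searches under the new hash and misses it.
        return set.Contains(key);
    }
}
```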

If you want to define "equality" based on a value that can change, you would be better served doing that in a separate class that implements IEqualityComparer, with the caveat that the objects should not be modified if they are to be used to do hash-based lookups.
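
A minimal sketch of that approach follows. Person and PersonNameComparer are hypothetical names, and the choice of StringComparer.OrdinalIgnoreCase is an assumption made for the example rather than something the question prescribes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable entity that keeps the default reference equality.
public sealed class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Equality rules live outside the entity, in a separate comparer class.
public sealed class PersonNameComparer : IEqualityComparer<Person>
{
    // Assumption: names are compared case-insensitively.
    private static readonly StringComparer Cmp = StringComparer.OrdinalIgnoreCase;

    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return Cmp.Equals(x.FirstName, y.FirstName)
            && Cmp.Equals(x.LastName, y.LastName);
    }

    public int GetHashCode(Person p)
    {
        if (p == null) return 0;
        unchecked
        {
            // Hash with the same comparer used in Equals so the two stay consistent.
            int hash = 17;
            hash = hash * 23 + (p.FirstName == null ? 0 : Cmp.GetHashCode(p.FirstName));
            hash = hash * 23 + (p.LastName == null ? 0 : Cmp.GetHashCode(p.LastName));
            return hash;
        }
    }
}

public static class ComparerDemo
{
    public static int DistinctCount()
    {
        // The comparer is supplied to the collection, not baked into Person.
        var set = new HashSet<Person>(new PersonNameComparer());
        set.Add(new Person { FirstName = "John", LastName = "Wayne" });
        set.Add(new Person { FirstName = "JOHN", LastName = "wayne" }); // treated as a duplicate
        return set.Count;
    }
}
```

The same comparer instance can be passed to Dictionary<TKey, TValue> or to LINQ methods such as Union and Distinct, so different parts of a program can apply different notions of equality to the same type.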

What happens if I only use FirstName for example?

You may get more collisions than if you used all relevant fields, which just means that the system has to do more work when looking up an item in a hash-based collection. It first finds all objects with the computed hash, then checks them against the original object using Equals.
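
To make the extra work visible, the sketch below hashes on the first name only and counts how often Equals runs. FirstNameHashUser, its EqualsCalls counter, and FirstNameHashDemo are hypothetical names added purely for instrumentation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: Equals uses both names, GetHashCode uses only the first name.
public sealed class FirstNameHashUser : IEquatable<FirstNameHashUser>
{
    // Demo-only counter of Equals invocations.
    public static int EqualsCalls;

    private readonly string _firstName;
    private readonly string _lastName;

    public FirstNameHashUser(string firstName, string lastName)
    {
        _firstName = firstName;
        _lastName = lastName;
    }

    public bool Equals(FirstNameHashUser other)
    {
        EqualsCalls++;
        return other != null
            && _firstName == other._firstName
            && _lastName == other._lastName;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as FirstNameHashUser);
    }

    // Still valid: equal users share a first name, hence a hash code.
    public override int GetHashCode()
    {
        return _firstName == null ? 0 : _firstName.GetHashCode();
    }
}

public static class FirstNameHashDemo
{
    // Returns the number of items stored; reports how many Equals calls were needed.
    public static int Run(out int equalsCalls)
    {
        FirstNameHashUser.EqualsCalls = 0;
        var set = new HashSet<FirstNameHashUser>();
        set.Add(new FirstNameHashUser("John", "Wayne"));
        // Same first name, so the hash collides and Equals must break the tie.
        set.Add(new FirstNameHashUser("John", "Smith"));
        equalsCalls = FirstNameHashUser.EqualsCalls;
        return set.Count;
    }
}
```

Both users end up in the set, so the result is still correct; the collision only costs additional Equals calls, and that cost grows with the number of users sharing a first name.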

Note that in your implementation you should do a null-check of FirstName, LastName, and Address:

    hash = hash * 23 + (FirstName == null ? 0 : FirstName.GetHashCode());
    hash = hash * 23 + (LastName  == null ? 0 : LastName.GetHashCode());
    hash = hash * 23 + (Address   == null ? 0 : Address.GetHashCode());
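
One further point worth checking: the Equals method in the question compares with StringComparison.InvariantCultureIgnoreCase, while string.GetHashCode() is case-sensitive, so "John" and "john" would compare equal yet could hash differently. A possible way to keep the two consistent is to hash through the matching StringComparer. This is a sketch; UserHash and Combine are hypothetical helper names.

```csharp
using System;

public static class UserHash
{
    // Same comparison rules as the Equals method in the question.
    private static readonly StringComparer Cmp = StringComparer.InvariantCultureIgnoreCase;

    public static int Combine(string firstName, string lastName, string address)
    {
        unchecked
        {
            int hash = 17;
            // StringComparer.GetHashCode throws on null, so keep the null checks.
            hash = hash * 23 + (firstName == null ? 0 : Cmp.GetHashCode(firstName));
            hash = hash * 23 + (lastName == null ? 0 : Cmp.GetHashCode(lastName));
            hash = hash * 23 + (address == null ? 0 : Cmp.GetHashCode(address));
            return hash;
        }
    }
}
```

With this helper, a GetHashCode override could simply return UserHash.Combine(FirstName, LastName, Address), and users that differ only in letter case would get the same hash code.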
D Stanley

This will depend on how you intend to compare your users. If you want equality comparisons to return true only when you are comparing two references to the same user object, then your Equals override is not needed.

However, if you want to compare users based on some other logic, then the answer as to which fields you should use in your Equals and GetHashCode implementations depends on your specific context. When doing equality comparisons, would you consider two users equal if they have the same first name and last name? What about if they have the same first name and last name but not the same address? Whichever fields you think define a unique user are the ones that you should use.

Oren Hizkiya
  • in this case, isn't it controlled by the Equals interface method? Why is it required in GetHashCode? – Alberto Montellano Feb 09 '15 at 16:35
  • @AlbertoMontellano Look at the documentation for that method to see why it exists and what it's used for. – Servy Feb 09 '15 at 16:36
  • @AlbertoMontellano I would refer you to this [SO Question](http://stackoverflow.com/questions/371328/why-is-it-important-to-override-gethashcode-when-equals-method-is-overridden) on why overriding GetHashCode is important – tyh Feb 09 '15 at 16:36
  • you're right guys, it seems the answer is that it is important to override GetHashCode, because it first checks GetHashCode and then the Equals method. – Alberto Montellano Feb 09 '15 at 16:49