2

For example I have class

class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

How to override GetHashCode in this class?

Grigor Aleksanyan
  • 540
  • 1
  • 6
  • 19
  • How are you defining equality here? `FirstName` and `LastName` match exactly? `FirstName` and `LastName` match after trimming? `FirstName` and `LastName` match in a locale-based comparison? Since your hash code is based on your equality definition, you need to define that equality definition first. – Jon Hanna Sep 07 '15 at 09:41
  • For example if (FirstName.Trim().ToLower() == other.FirstName.Trim().ToLower() && LastName.Trim().ToLower() == other.LastName.Trim().ToLower()) – Grigor Aleksanyan Sep 07 '15 at 17:23
  • Then you need a hash code that will be the same for different objects that are the the same by that rule. `(FirstName.Trim().ToLower + "\0" + LastName.Trim().ToLower()).GetHashCode()` for example would be a simple example, of reasonable quality though it requires allocations that you could skip with more involved approaches. – Jon Hanna Sep 07 '15 at 20:47

3 Answers3

1

You should base your object's hashcode calculation on immutable fields, if you can make Person's FirstName and LastName fields immutable, you can use the following pattern based on Josh Bloch's suggestion:

public override int GetHashCode()
{
    int hash = 17;
    hash = hash * 31 + FirstName.GetHashCode();
    hash = hash * 31 + LastName.GetHashCode();
    return hash;
}
Saeb Amini
  • 23,054
  • 9
  • 78
  • 76
  • ...And whenever `FirstName` is *null*, the implementation fails. – Dmitry Bychenko Sep 07 '15 at 08:47
  • ...Another problem is the *integer overflow* (`31 * hash` can well exceed `int.MaxValue`). To prevent it, surrond the computation with `unchecked {...}` – Dmitry Bychenko Sep 07 '15 at 08:52
  • @DmitryBychenko, correct on both points, but error checking in usually omitted for the sake of simplicity in small samples to get the main point across. – Saeb Amini Sep 07 '15 at 08:54
  • You don't need immutability for that. Immutability is a good way to make sure the object doesn't change while being used in a hash-based structure, but a mutable object will also work in a hash-based structure if you simply don't change it in the meantime. – Jon Hanna Sep 07 '15 at 09:39
  • @JonHanna _"Immutability is a good way to make sure the object doesn't change while being used in a hash-based structure"_, not sure what you mean by that, immutability means an object does not change _at all_, regardless of the context of use, and in the case of being used in calculating an object's hashcode for a hash-based structure, it makes sure the object _isn't lost_ . _If_ you are certain your mutable fields won't be changed by you, or anyone/anything else in the meantime, then yes, it would work, but that's a big assumption and depends on your situation. – Saeb Amini Sep 07 '15 at 10:10
  • Yes, and if you can be immutable, then immutability means you can be sure; but immutability has nothing to do with the actual question. Or indeed your answer, which would be a good answer to the question if you took out the bit about immutability, or at least made that further advice rather than suggesting it was directly related to the question. – Jon Hanna Sep 07 '15 at 10:54
  • @JonHanna you _want_ to be sure that the hash does not change almost all the time, and calculating it based on immutable fields is a great solution for that, so I think the two are related and still find my answer useful. Note that I didn't say _must_ in my suggestion, but you should do it because it lets you have one less thing to worry about when using the object in a hashset (not considering others who will use your object). Since your point is only valid under a big assumption, I find your argument moot, but if you'd like to continue arguing for it, I guess we'll have to agree to disagree. – Saeb Amini Sep 07 '15 at 12:51
  • Ok, we can do validation before getting hash codes. Are possible if FirstName and LastName not equal but the object's GetHashCode algorithm returns equals values? – Grigor Aleksanyan Sep 07 '15 at 17:32
  • Аlthough, after checking hashcodes usually we need override Equals method.. so, thats ok. – Grigor Aleksanyan Sep 07 '15 at 17:36
  • 1
    @GrigorAleksanyan, yes, it is possible for hashcodes to collide, `int` can have 2 ^ 32 distinct values which means 2 ^ 32 hashcodes for all strings, but the number of possible strings are limitless so eventually some collision might occur and you can't prevent that. However, the goal is to increase the likelihood of uniqueness and the above formula (multiplying by prime numbers) does a good job while being performant. – Saeb Amini Sep 07 '15 at 17:40
-1
public override int GetHashCode()
    {
        return String.Concat(FirstName, LastName).GetHashCode();
    }
Vitaly
  • 2,064
  • 19
  • 23
  • This is an extremely bad approach. `FirstName=ab;SecondName=cd` is *NOT* the same as `FirstName=a;SecondName=bcd` and should not give the same hash code. – Rob Sep 07 '15 at 08:33
  • That is called a collision, nothing bad with them. Don't want to have collisions to often? Use real names for your kids, don't just go through all options. – Vitaly Sep 07 '15 at 08:49
  • There most definitely is something bad about collisions, they're never desirable. Doubly so when you can easily avoid them. And what exactly is a 'real name'? `Rob` and `Robert` are both valid names, and it's not hard to find a situation where you'll get a collision because of the family name. Don't write bad code and then blame the user. – Rob Sep 07 '15 at 08:52
  • Taking into account the class purpose, the collisions above (`ab` and `cd` v. `abc` and `d`) are not expected. What's wrong with the code is `String.Concat(FirstName, LastName)` - it takes too much computation resources. Hashes should be computed *fast*. – Dmitry Bychenko Sep 07 '15 at 08:56
  • Correct, we should avoid collisions, but since hash-code is a limited number, to avoid we must shift collisions to unreal or rare values. And that's what your example is - unreal situation, which is good to be a collision, reducing collisions from a real persons. Have you noticed, that class is Person, not SomeTwoStringsThatCanDistinctBy1OR2Letters – Vitaly Sep 07 '15 at 08:57
  • @DmitryBychenko Even if concatenating strings were a fast operation, this would still be bad code. You should not ever combine values *prior* to generating the hash code, otherwise you're just asking for a bad time. – Rob Sep 07 '15 at 08:58
  • @Vitaly That still does not excuse you from writing flawed code. There is *no* benefit to concatenating before hashing, and recommending it is deliberately giving a bad answer. Whether or not it's `FirstName` and `LastName` or not is not an issue. Why not just write `return (FirstName + LastName)Sum(c => (int)c)` If we're in the mood for quick and easy code without caring about quality? – Rob Sep 07 '15 at 09:01
  • >>There is no benefit to concatenating before hashing - You get a "unique" string after concatenation (for the current context), don't you? >>Why not just write - This way you have collisions all the time – Vitaly Sep 07 '15 at 09:16
  • >>Whether or not it's FirstName and LastName or not is not an issue - I can't agree with you on that. I don't think solving such task in "common way" is good idea. When we're talking about Person, we have much less strings, than if we go through all the options in the strings and that should be taken in consideration – Vitaly Sep 07 '15 at 09:21
-1

You can do something like this:

public override int GetHashCode()
{
    return FirstName.GetHashCode() ^ LastName.GetHashCode()
}

Check this for more

Community
  • 1
  • 1