
I hate to beat a dead horse. In @eric-lippert's blog, he states:

the hash value of an object is the same for its entire lifetime

Then follows up with:

However, this is only an ideal-situation guideline

So, my question is this.

For a POCO (nothing overridden) or a framework object (like `FileInfo` or `Process`), is the return value of the `GetHashCode()` method guaranteed to be the same during its lifetime?

P.S. I am talking about already-allocated objects. Given `var foo = new Bar();`, will `foo.GetHashCode()` always return the same value?

AngryHacker
  • `Is the return value of the GetHashCode() method guaranteed to be the same during its lifetime`, **no**. Easy to test: run `Console.WriteLine("What am I?".GetHashCode()); Console.WriteLine("What am I?".GetHashCode());` and you will see. `GetHashCode` was never meant to be stable; one reason is that it can differ across .NET implementations – Trevor May 02 '19 at 21:02
  • @Çöđěxěŕ It is the same. – AngryHacker May 02 '19 at 21:06
  • @Çöđěxěŕ That's why I said for the lifetime of the object. – AngryHacker May 02 '19 at 21:08
  • Sh!tness, I apologize. Within the lifetime of a single program execution, **yes**, but not across different program executions. For example, run it in a console app, then run the same code in another app, and you should see the difference... – Trevor May 02 '19 at 21:09
  • Strictly speaking (i.e., being pedantic), it is a convention that an object instance should return the same hash value during its lifetime. A convention which dictionaries, other collections and other components in .NET rely on to function properly and reliably. However, the .NET framework cannot guarantee that this convention is being kept in any case -- if you write your own class with a GetHashCode() override that violates this convention, there is nothing stopping and preventing you from doing that, even if it is a completely bonkers/bad thing to do... –  May 02 '19 at 21:12
  • @elgonzo So yeah, that's the crux of the question. Without any overrides for my own objects can I count on GetHashCode being the same during lifetime across whatever threads I cook up in my app? – AngryHacker May 02 '19 at 21:17
  • @AngryHacker by the way, this is a great question. – Trevor May 02 '19 at 21:18
  • @elgonzo Additionally, can I also count on framework based objects not to mess with the guideline based GetHashCode implementation? – AngryHacker May 02 '19 at 21:18
  • If you don't override GetHashCode yourself, you are safe. If you do override GetHashCode and adhere to this convention, you are safe. The .NET framework/standard/core classes adhere to this convention (I am not aware of any type in the framework library that violates it). And unless you use 3rd-party code from an utterly incompetent programmer (pardon), types from 3rd-party libraries should not cause problems either... –  May 02 '19 at 21:21

2 Answers


If you look at the MSDN documentation you will find the following remarks about the default behavior of the GetHashCode method:

If GetHashCode is not overridden, hash codes for reference types are computed by calling the Object.GetHashCode method of the base class, which computes a hash code based on an object's reference; for more information, see RuntimeHelpers.GetHashCode. In other words, two objects for which the ReferenceEquals method returns true have identical hash codes. If value types do not override GetHashCode, the ValueType.GetHashCode method of the base class uses reflection to compute the hash code based on the values of the type's fields. In other words, value types whose fields have equal values have equal hash codes

Based on my understanding we can assume that:

  • for a reference type (which doesn't override Object.GetHashCode), the hash code of a given instance is guaranteed to be the same for the entire lifetime of the instance (my initial reasoning was that the memory address at which the object is stored won't change during its lifetime; that reasoning is not accurate, see the edit below)
  • for a value type (which doesn't override ValueType.GetHashCode), it depends: if the value type is immutable, the hash code won't change during its lifetime. If the values of its fields can be changed after its creation, its hash code can change too. Note that value types are generally designed to be immutable.

IMPORTANT EDIT

As pointed out in one of the comments on this answer, the .NET garbage collector can move an object to a different physical location in memory during the object's lifetime. In other words, an object can be "relocated" inside managed memory.

This makes sense because the garbage collector is in charge of managing the memory allocated when objects are created.

After some searching, and according to this Stack Overflow question (read the comments provided by the user @supercat), it seems that this relocation does not change the hash code of an object instance during its lifetime, because the hash code is calculated once (the first time its value is requested) and the computed value is saved and reused whenever the hash code is requested again.

To summarize, based on my understanding, the one thing you can safely assume is that two references pointing to the same object in memory will always yield identical hash codes. In other words, if `Object.ReferenceEquals(a, b)` then `a.GetHashCode() == b.GetHashCode()`. Furthermore, it seems that the hash code of a given object instance stays the same for its entire lifetime, even if the garbage collector changes the physical memory address of the object.
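The summary above can be checked with a small experiment. The following is a minimal sketch, not a proof: `PlainPoco` and `DefaultHashDemo` are hypothetical names introduced only for illustration, and forcing a compacting collection does not guarantee that this particular object is actually moved.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical POCO for illustration: it overrides neither Equals nor GetHashCode.
public sealed class PlainPoco
{
    public int Value { get; set; }
}

public static class DefaultHashDemo
{
    // Returns true if the default hash code of one instance is unchanged after
    // mutating a property and after requesting a compacting garbage collection.
    public static bool HashSurvivesMutationAndGc()
    {
        var foo = new PlainPoco { Value = 1 };
        int before = foo.GetHashCode();

        // Mutating state does not affect the default (identity-based) hash code.
        foo.Value = 2;

        // Ask for a blocking, compacting collection; the GC may or may not move foo.
        GC.Collect(2, GCCollectionMode.Forced, blocking: true, compacting: true);

        int after = foo.GetHashCode();

        // With no override, Object.GetHashCode agrees with RuntimeHelpers.GetHashCode.
        return before == after && after == RuntimeHelpers.GetHashCode(foo);
    }

    // Two references to the same object always report the same hash code.
    public static bool SameReferenceSameHash()
    {
        var a = new PlainPoco();
        PlainPoco b = a;
        return ReferenceEquals(a, b) && a.GetHashCode() == b.GetHashCode();
    }
}
```

Both methods are expected to return `true` within a single run; the actual hash code values are not meaningful and should not be compared across program executions.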

SIDENOTE ON HASH CODES USAGE

It is important to remember that the hash code was introduced in the .NET framework for the sole purpose of supporting the hash table data structure.

In order to determine the bucket to be used for a given value, the corresponding key is taken and its hash code is computed (to be precise, the bucket index is obtained by applying some normalizations on the value returned by the GetHashCode call, but the details are not important for this discussion). Put another way, the hash function used in the .NET implementation of hash tables is based on the computation of the hash code of the key.
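To make the "normalization" step concrete, here is a minimal sketch of one common way to turn a hash code into a bucket index. This is an assumption for illustration only; `BucketSketch` and `BucketIndex` are hypothetical names, and the real `Dictionary<TKey, TValue>` implementation differs in its details and has changed between .NET versions.

```csharp
using System;

public static class BucketSketch
{
    // Maps an arbitrary hash code to a bucket index in [0, bucketCount).
    // Clearing the sign bit first keeps the remainder non-negative,
    // since GetHashCode may return negative values.
    public static int BucketIndex(int hashCode, int bucketCount)
    {
        if (bucketCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(bucketCount));

        return (hashCode & 0x7FFFFFFF) % bucketCount;
    }
}
```

For example, `BucketSketch.BucketIndex("some key".GetHashCode(), 7)` always yields a value between 0 and 6 within one run, although which value it yields can differ between program executions.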

This means that the only safe use for a hash code is balancing a hash table, as pointed out by Eric Lippert here, so don't write code that depends on hash code values for any other purpose.

Enrico Massone
  • This answers my question. Additional one... How would you classify `string`? `string s = "1"; Console.WriteLine(s.GetHashCode());s = "2"; Console.WriteLine(s.GetHashCode());` produces 2 different results. – AngryHacker May 02 '19 at 21:25
  • System.String is a reference type. In your example you get two different hash codes because you are referencing two different strings (the string you get from the literal "1" and the one you get from "2" are two different objects in memory). – Enrico Massone May 02 '19 at 21:28
  • "because the memory address at which the object is stored won't change during its lifetime" - I thought the garbage collector could relocate objects? – Joe Sewell May 02 '19 at 21:28
  • @EnricoMassone I thought so. Thanks for confirming. – AngryHacker May 02 '19 at 21:30
  • @AngryHacker, note that "1" and "2" are two different object instances (string instances). There is no such convention/rule that stipulates that two different object instances must return equal hash codes (unless both instances are considered to be equal, in which case the hash values of each instance have to be equal as well) –  May 02 '19 at 21:31
  • @JoeSewell thanks for the comment ! Take a look at my edit. – Enrico Massone May 02 '19 at 22:09

There are three cases.

  1. A class which does not override GetHashCode
  2. A struct which does not override GetHashCode
  3. A class or struct which does override GetHashCode

If a class does not override GetHashCode, then the return value of the helper function RuntimeHelpers.GetHashCode is used. This will return the same value each time it's called for the same object, so an object will always have the same hash code. Note that this hash code is specific to a single AppDomain - restarting your application, or creating another AppDomain, will probably result in your object getting a different hash code.

If a struct does not override GetHashCode, then the hash code is generated from its field values (in some cases from the hash code of just one of its members). Of course, if your struct is mutable, then those values can change over time, and so the hash code can change over time. Even if the struct is immutable, a member could itself be mutated, and could return different hash codes.
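The struct case can be illustrated with a short sketch. `MutablePoint` and `StructHashDemo` are hypothetical names used only for this example, and the exact algorithm behind `ValueType.GetHashCode` is an implementation detail, so the second method only reports the two values rather than promising they differ.

```csharp
using System;

// Hypothetical mutable struct for illustration; it does not override GetHashCode.
public struct MutablePoint
{
    public int X;
    public int Y;
}

public static class StructHashDemo
{
    // Two separate values with equal fields are equal and hash alike.
    public static bool EqualValuesHashEqual()
    {
        var a = new MutablePoint { X = 1, Y = 2 };
        var b = new MutablePoint { X = 1, Y = 2 };
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    // Returns the hash code of the same variable before and after mutating a field.
    // The two values will most likely differ, but that is not a documented guarantee.
    public static (int Before, int After) HashBeforeAndAfterMutation()
    {
        var p = new MutablePoint { X = 1, Y = 2 };
        int before = p.GetHashCode();
        p.X = 42; // mutate the value in place
        int after = p.GetHashCode();
        return (before, after);
    }
}
```

Unlike a class that relies on the default implementation, the struct's hash code follows its current field values, which is why a mutable struct makes a poor dictionary key.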

If a class or struct does override GetHashCode, then all bets are off. Someone could implement GetHashCode by returning a random number - that's a bit of a silly thing to do, but it's perfectly possible. More likely, the object is mutable and its hash code is based on its members, both of which can change over time.

It's generally a bad idea to implement GetHashCode for objects which are mutable, or in a way where the hash code can change over time (in a given AppDomain). Many of the assumptions made by classes like Dictionary<TKey, TValue> break down in this case, and you will probably see strange behaviour.
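Here is a minimal sketch of the kind of strange behaviour meant above. `MutableKey` and `MutableKeyDemo` are hypothetical names introduced for illustration; the key's hash code is deliberately derived from a mutable property.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose hash code depends on a mutable property.
public sealed class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;

    public override int GetHashCode() => Id;
}

public static class MutableKeyDemo
{
    // Looks up the very same key instance before and after mutating it.
    public static (bool FoundBefore, bool FoundAfter) LookupBeforeAndAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var map = new Dictionary<MutableKey, string> { [key] = "value" };

        bool foundBefore = map.ContainsKey(key);

        // The dictionary filed the entry under the old hash code (1).
        // After this change the key reports a different hash code (2).
        key.Id = 2;

        bool foundAfter = map.ContainsKey(key);
        return (foundBefore, foundAfter);
    }
}
```

The entry is still in the dictionary (its `Count` stays 1), but the lookup after the mutation no longer finds it, even though the exact same instance is used as the key.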

canton7
  • When you say `bad idea to implement GetHashCode for objects which are mutable`, do you mean override the method in your own class? – AngryHacker May 02 '19 at 21:33
  • Yes. It's a bad idea to override GetHashCode in an object which is mutable. You can only override any method in your own classes - you can't override a method in someone else's class – canton7 May 02 '19 at 21:35
  • It's a bad idea to have a hash code that changes over an object's lifetime, so it's a bad idea to implement GetHashCode for objects which are mutable... Unless you don't consider any of the mutable members to be part of the identity. But this does have very limited applicability. – John Gietzen Feb 25 '20 at 03:25