21

Folks, here's a thorny problem for you!

A part of the TickZoom system must collect instances of every type of object into a Dictionary<> type.

It is imperative that their equality and hash code be based on the instance of the object which means reference equality instead of value equality. The challenge is that some of the objects in the system have overridden Equals() and GetHashCode() for use as value equality and their internal values will change over time. That means that their Equals and GetHashCode are useless. How to solve this generically rather than intrusively?

So far, We created a struct to wrap each object called ObjectHandle for hashing into the Dictionary. As you see below we implemented Equals() but the problem of how to calculate a hash code remains.

public struct ObjectHandle : IEquatable<ObjectHandle>{
    public object Object;
    public bool Equals(ObjectHandle other) {
        return object.ReferenceEquals(this.Object,other.Object);
    }
}

See? There is the method object.ReferenceEquals() which will compare reference equality without regard for any overridden Equals() implementation in the object.

Now, how to calculate a matching GetHashCode() by only considering the reference without concern for any overridden GetHashCode() method?

Ahh, I hope this give you an interesting puzzle. We're stuck over here.

Sincerely, Wayne

Wayne
  • 2,959
  • 3
  • 30
  • 48

3 Answers3

25

RuntimeHelpers.GetHashCode() does exactly what is needed here.

Athari
  • 33,702
  • 16
  • 105
  • 146
Wayne
  • 2,959
  • 3
  • 30
  • 48
  • 1
    is returned value constant during lifetime of object? if yes, then this is really nice thing. – Andrey May 31 '10 at 18:51
  • 2
    Yes. It's constant. The GC keeps a handle to the object. The actual location of it can be moved around. So it hashes the handle to it. – Wayne Jun 01 '10 at 16:34
  • 1
    @Wayne: One might expect things to work that way, but they don't. Each object's header contains 32 bits that may *either* hold the value that should be returned by `RuntimeHelpers.GetHashCode()` or an index into a table that includes a number of things including the `RuntimeHelpers.GetHashCode()` value (certain ranges of integer values are used for different purposes). Object relocation is handled by tracking down and changing every reference to an object that exists anywhere in the universe. – supercat Nov 18 '13 at 23:56
  • @Wayne: Note that if at some moment in time a field holds the only reference to some object anywhere in the universe (say, an instance of `String`), there is no observation the code could have made before that point in time which would always be sufficient to determine at some later time whether the field still refers to that same object, rather than referring to a new object containing the same data. – supercat Nov 19 '13 at 00:00
  • RuntimeHelpers.GetHashCode(obj) returns same thing as GetHashCode(obj) – gdbdable Jan 15 '14 at 08:57
  • @gdbdable I'm don't think that's true – kofifus Jul 09 '20 at 03:43
2

You are breaking the pattern, this leads to questions that can't be solved. Method Equals should compare the contents of objects, instead of comparing references. That is what object.Equals does, why override with same behavior?

Now about GetHashCode. Again, hashcode is hash function applied to contents of object. You can't calculate it by reference only. You could use pointer to object and use it as hash, but in .net objects' addresses can be changed by GC.

Andrey
  • 59,039
  • 12
  • 119
  • 163
  • You are correct. But TickZoom does some very advanced and high performance programming techniques ordinarily only seen in C++ rather than C#. In this case we're doing aspect oriented programming to spit out trace of method calls in UML sequence format for Trace2UML tool to automatically generate UML sequence diagrams. The aspect must keep a record of each object it has already seen so that when it sees a new one it can emit the text command to draw that instance on the diagram. So it's irrelevant the contents of the object. Simple an internally almost compiler like issue. – Wayne May 31 '10 at 17:58
  • @Wayne: ok. if you want to store something in Dictionary then you **must** override correctly `GetHashcode`, otherwise you can't use dictionary. well, you can, but it becomes inefficient. use linked lists then. – Andrey May 31 '10 at 18:46
  • Correct answer is below because the GC has a unique handle to every item but it won't share that with you. .NET provides Equals() and GetHashCode() for the default object if you don't override which are based on that handle and only compares the reference equality. So if you don't override them that's all you get. My need was to use the original default Equals and GetHashCode and .NET offers access to those to get around somebody overriding them with RuntimeHelpers.Equals and RuntimeHelpers.GetHasCode. – Wayne Jun 01 '10 at 16:37
  • 1
    @Wayne i had a feeling that you can get the handle, or hash of handle somehow. now i know that RuntimeHelpers.GetHashCode() can do it :) – Andrey Jun 01 '10 at 16:53
  • @Wayne: In some frameworks the GC keeps a unique handle for every item which will never change during its lifetime, but that is not true for common implementations of .NET. Every object has a unique address, but when the GC relocates an object it will track down every reference and change it to reflect the new address. When the default `GetHashCode` is called the first time on an object, the system will arbitrarily select a number and store it in a word where it shares space with some other bits; there's no particular reason to expect that number not to collide with other live objects. – supercat Jun 16 '13 at 01:53
  • @supercat. Thanks. Does that cause a problem? I think GetHashCode() is expected to have collissions. What is critical is that RuntimeHelpers.Equals() can always give an correct answer of TRUE if comparing an instance to itself and FALSE if comparing to another instance. – Wayne Jun 19 '13 at 00:06
  • @Wayne: Properly-written code should not rely upon `GetHashCode` values being unique, but not all code is properly written. Hash collisions using `RuntimeHelpers.GetHashCode` will be infrequent enough that they should not pose a performance problem in code that has any sane method of handling collisions, but they'll be frequent enough that code which can't handle hash collisions will fail at least occasionally. – supercat Jun 19 '13 at 15:08
0

The hash code doesn't have to be unique. (But uniqueness will improve performance).
So - one thing you can do is set the hash code by the type name. All object from the same type will have the same hash code.

Itay Karo
  • 17,924
  • 4
  • 40
  • 58
  • why use hashcode if it is constant? it will make any Dictionary linked list. – Andrey May 31 '10 at 16:42
  • Correct. MSDN says that hash code should be randomly distributed over the range of value for best performance. Keep in mind it is possible to need to use an object BOTH ways. Hashed by it's values and also, in a different context, hashed by only it's reference. That's the problem in this case. And I found the answer but posted it separately as an answer. Hope you benefit also. – Wayne May 31 '10 at 17:54