43

If I call the Object.hashCode() method on some object, it returns the internal address of the object (in the default implementation). Is this address a logical or a physical address?

During garbage collection, memory compaction moves objects around in memory. If I call hashCode() before and after a GC, will it return the same hash code? (It does.) If so, why, given that compaction may change the object's address?

Wilfred Hughes
Ashish
  • If you print the value of a few `Object.hashCode`s, you'll probably notice that they are unlikely to be addresses. Odd numbers on any reasonable implementation, for instance. – Tom Hawtin - tackline Sep 26 '10 at 14:14

5 Answers

61

@erickson is more or less correct. The hashcode returned by java.lang.Object.hashCode() does not change for the lifetime of the object.

The way this is (typically) implemented is rather clever. When an object is relocated by the garbage collector, its original hashcode has to be stored somewhere in case it is used again. The obvious way to implement this would be to add a 32-bit field to the object header to hold the hashcode. But that would add one word of overhead to every object, and would waste space in the most common case, where an object's hashCode method is never called.

The solution is to add two flag bits to the object's flag word, and use them (roughly) as follows. The first flag is set when the hashCode method is called. The second flag tells the hashCode method whether to use the object's current address as the hashcode, or to use a stored value. When the GC runs and relocates an object, it tests these flags. If the first flag is set and the second one is unset, the GC allocates one extra word at the end of the object and stores the original object location in that word. Then it sets the two flags. From then on, the hashCode method gets the hashcode value from the word at the end of the object.
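The scheme above can be sketched as a toy simulation in plain Java. This is not real JVM code: `SimObject`, its fields, and `relocate` are hypothetical stand-ins for the header flag bits, the extra trailing word, and the GC's relocation step, and an actual JVM may implement identity hashing quite differently.

```java
public class HashFlagSketch {
    /** A simulated heap object; every field is a hypothetical stand-in. */
    static final class SimObject {
        int address;         // current (simulated) location of the object
        boolean hashed;      // flag 1: hashCode has been called at least once
        boolean hashStored;  // flag 2: read the hash from the stored word
        int storedHash;      // the "extra word"; meaningful only if hashStored

        SimObject(int address) {
            this.address = address;
        }

        /** Simulated identity hash: the address until a stored value exists. */
        int identityHash() {
            hashed = true;
            return hashStored ? storedHash : address;
        }
    }

    /** Simulated GC relocation: preserve the old address only if it was exposed. */
    static void relocate(SimObject o, int newAddress) {
        if (o.hashed && !o.hashStored) {
            o.storedHash = o.address; // the extra word holds the original location
            o.hashStored = true;
        }
        o.address = newAddress;
    }

    public static void main(String[] args) {
        SimObject a = new SimObject(0x1000);
        int before = a.identityHash();   // exposes address 0x1000 as the hash
        relocate(a, 0x2000);
        System.out.println(before == a.identityHash()); // hash survives the move

        SimObject b = new SimObject(0x3000);
        relocate(b, 0x4000);             // never hashed, so nothing is stored
        System.out.println(b.hashStored);
    }
}
```

An object whose hash was never requested pays no extra space when it moves; only objects that have already exposed their address as a hash grow by one word.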


In fact, an identityHashCode implementation has to behave this way to satisfy the following part of the general hashCode contract:

"Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application."

A hypothetical implementation of identityHashCode() that simply returned the current machine address of an object would violate the requirement to consistently return the same integer if/when the GC moved the object to a different address. The only way around this would be for the (hypothetical) JVM to guarantee that an object never moves once hashCode has been called on it. And that would lead to serious and intractable problems with heap fragmentation.
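A minimal way to observe this guarantee from user code is to compare `System.identityHashCode` before and after a collection is requested. Note the assumptions: `System.gc()` is only a hint, so the JVM may or may not actually collect or move anything, and the class and method names here are made up for illustration. The check passes either way, because the contract requires it.

```java
public class StableHashDemo {
    /** Returns true if the identity hash is the same before and after a GC request. */
    static boolean stableAcrossGc() {
        Object o = new Object();
        int before = System.identityHashCode(o);

        // Create some short-lived garbage so a collection has work to do.
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];
        }
        System.gc(); // a request only; the JVM is free to ignore it

        int after = System.identityHashCode(o);
        return before == after;
    }

    public static void main(String[] args) {
        System.out.println(stableAcrossGc());
    }
}
```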

Stephen C
  • 2
    Great explanation Stephen! Your description of working of hashCode() clarifies how hashCode() retains the same value throughout the program run. Meanwhile if a GC+memory compaction takes place, and a new object (whose hashCode() has not been invoked yet) is allocated the same space as the old one, then wouldn't the hashCode() value be same as that of the active object that initially occupied the memory location? How does this affect object equality and Hash based collections? – Ashwin Prabhu Sep 19 '11 at 08:52
  • That is explained by the 3rd paragraph of my answer. Basically, the original address / hashcode is stored at the end of the object when it is relocated. But only when necessary; i.e. only if `identityHashCode()` *has been* called. – Stephen C Sep 19 '11 at 10:53
  • 2
    What I meant was, Object1 has hashCode 100 and this is copied into the extra word at the end of Object1. At this point assume that a GC compaction takes place and Object1 is moved someplace else, freeing its original memory location for newer allocations. Assume that due to some coincidence the new Object2 is somehow allocated at the old location of Object1. What will be the hashCode of Object2? Won't it be 100? This would mean Object1 (now moved elsewhere, but having hashCode 100 saved in the last word) and Object2 (allocated at Object1's old location) will share the same hashCode! – Ashwin Prabhu Sep 19 '11 at 16:03
  • 2
    @AshwinPrabhu - yes it will. But that doesn't matter. The identity hashcode is a hashcode ... not a unique identifier. – Stephen C Feb 22 '12 at 12:26
  • In OpenJDK, `hashCode()` is a [native method](https://github.com/openjdk/jdk/blob/jdk-11%2B28/src/java.base/share/native/libjava/Object.c#L43) whose behavior depends on the specific JVM implementation, [such as HotSpot](https://github.com/openjdk/jdk/blob/jdk-11+28/src/hotspot/share/prims/jvm.cpp#L595). In the Android world, the "add two flag bits to the object's flag word" solution does seem to be used; see [`obj.shadow$_monitor_`](https://android.googlesource.com/platform/libcore/+/refs/tags/android-11.0.0_r42/ojluni/src/main/java/java/lang/Object.java#118). – Weekend Sep 07 '21 at 03:09
  • Yea ... there are a few ways that it can be implemented. But all ways must conform to the rule that "the identity hashcode never changes". – Stephen C Sep 07 '21 at 03:19
6

No, the default hash code of an object will not change.

The documentation doesn't say that the hash code is the address; it says that it is based on the address. Consider that hash codes are 32 bits, but there are 64-bit JVMs. Clearly, directly using the address wouldn't always work.
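To make the 32-bit versus 64-bit point concrete, here is a small sketch of what any address-to-int conversion must do: lose information. The `fold` helper is hypothetical and simply reuses the formula of `Long.hashCode`; it is not claimed to be what any real JVM does. The two "addresses" are made-up values chosen to collide.

```java
public class AddressFoldSketch {
    /** Hypothetical narrowing of a 64-bit address: XOR the high and low halves. */
    static int fold(long address) {
        return Long.hashCode(address); // (int) (address ^ (address >>> 32))
    }

    public static void main(String[] args) {
        long a = 0x0000_0001_0000_0000L; // made-up address: only bit 32 set
        long b = 0x0000_0000_0000_0001L; // made-up address: only bit 0 set

        System.out.println(a != b);             // two distinct addresses...
        System.out.println(fold(a) == fold(b)); // ...narrowed to the same int
    }
}
```

Whatever conversion is used, two different 64-bit values can end up with the same 32-bit result, so the hash code cannot be treated as the address itself.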

The implementation depends on the JVM, but in the Sun (Oracle) JVM, I believe the hash code is cached the first time it's accessed.

erickson
  • 2
    From Java Doc of hashCode: This is typically implemented by converting the internal address of the object into an integer – Ashish Sep 26 '10 at 06:55
  • Actually, the hashcode is cached when the GC relocates an object ... if `hashCode()` has previously been called. – Stephen C Sep 26 '10 at 07:27
  • 1
    Actually Ashish, the javadoc says this: "This is typically implemented by converting the internal address of the object into an integer, **but this implementation technique is not required by the Java™ programming language.**" Indeed, recent JVMs have a command line option that allows you to choose other methods for generating hashcodes. – Stephen C May 10 '17 at 06:00
  • 1
    Also, "conversion" implies a fundamental change, not a simple, reversible type cast. – erickson May 10 '17 at 16:11
0

In this link it says that indeed the default hash code is the JVM address of the object, but if it is moved - the address stays consistent. I don't know how reliable this source is, but I am sure that the implementors of this method thought of this scenario (which is not rare or corner case), and ensured correct functionality of this method.

duduamar
0

By the contract of hashCode it cannot change for such a reason.

user207421
-3

If the hashcode changes, the object will disappear from any hash set it was inserted into, and Sun will be flooded with complaints.
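The failure mode is easy to reproduce with a class whose hash code is deliberately unstable. `MutableKey` below is a hypothetical class written for this illustration; the default `Object.hashCode()` never behaves like this, which is exactly the point.

```java
import java.util.HashSet;
import java.util.Set;

public class VanishingKeyDemo {
    /** Hypothetical key whose hash code depends on a mutable field. */
    static final class MutableKey {
        int value;

        MutableKey(int value) {
            this.value = value;
        }

        @Override
        public int hashCode() {
            return value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).value == value;
        }
    }

    /** Adds a key, changes its hash code, then looks it up again. */
    static boolean foundAfterHashChange() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey(1);
        set.add(key);
        key.value = 2;            // the hash code changes while the key is in the set
        return set.contains(key); // lookup uses the new hash and misses the entry
    }

    public static void main(String[] args) {
        System.out.println(foundAfterHashChange()); // false: the object has "disappeared"
    }
}
```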

irreputable