I'm trying to understand the native implementation of the hashCode() method. What exactly does this method return? Is it a memory address or is it a random value?

RustyTheBoyRobot
johnny-b-goode
  • The OpenJDK sources should tell you. I doubt it's a memory address, since the GC can move objects around in memory; it's probably some sort of internal object handle. – millimoose Apr 19 '13 at 13:11
  • This is an excerpt of said code from around 2008: http://blogs.tedneward.com/CommentView,guid,eca26c5e-307c-4b7c-931b-2eaf5b176e98.aspx – millimoose Apr 19 '13 at 13:13
  • And straight from the horse's mouth: http://hg.openjdk.java.net/jdk7/hotspot/hotspot/file/9b0ca45cd756/src/share/vm/runtime/synchronizer.cpp The code looks the same to me as in the blog post. – millimoose Apr 19 '13 at 13:39
  • Possible duplicate of [What is an "internal address" in Java?](http://stackoverflow.com/questions/13860194/what-is-an-internal-address-in-java) – assylias Apr 19 '13 at 13:42

3 Answers

The native implementation of hashCode() depends on the JVM.

For example, HotSpot has six Object.hashCode() implementations. You can select one with the -XX:hashCode=n flag when launching the JVM from the command line, where n is:

0 – Park-Miller RNG (the default through JDK 7; JDK 8 and later default to 5)
1 – f(address, global state)
2 – constant 1
3 – Serial counter
4 – Object address
5 – Thread-local Xorshift
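
A minimal sketch for experimenting with the strategies listed above. The class name `IdentityHashDemo` and its two helper methods are illustrative only; the standard calls used are `Object.hashCode()`, `System.identityHashCode()`, and `System.gc()`. It checks that a plain `Object` reports its identity hash, and that the value does not change after a GC request, whichever strategy produced it.

```java
final class IdentityHashDemo {
    // True when the object's hashCode() is the JVM-provided identity hash,
    // i.e. the class has not overridden Object.hashCode().
    static boolean usesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    // True when the identity hash is unchanged after requesting a GC.
    // System.gc() is only a hint, so this demonstrates stability rather
    // than proving the object was actually moved.
    static boolean stableAcrossGc(Object o) {
        int before = System.identityHashCode(o);
        System.gc();
        return before == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        Object plain = new Object();
        System.out.println("identity hash:    " + System.identityHashCode(plain));
        System.out.println("uses identity:    " + usesIdentityHash(plain));
        System.out.println("stable across GC: " + stableAcrossGc(plain));
    }
}
```

Running it as `java -XX:hashCode=2 IdentityHashDemo` on a HotSpot build that accepts the flag should, going by the list above, report an identity hash of 1. Some newer builds may treat the flag as experimental and also need `-XX:+UnlockExperimentalVMOptions`; that is an assumption to verify against your JVM.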

bsiamionau
  • It's also interesting to note that because there is limited space to store things in the object header, the default hash code is not the full 32 bits of an int: on a 32-bit JVM it is only [25 bits wide](https://stackoverflow.com/questions/26357186/what-is-in-java-object-header). – Boann Apr 13 '15 at 17:13
From the documentation:

> As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)

So it may be related to a memory address, but it doesn't have to be - and you definitely shouldn't make any assumption about it being related to memory at all.

Nothing you do with a hash code should care about this at all. The only things you should infer from hash codes are:

  • If the hash codes of two objects are the same, they may be equal objects
  • If the hash codes of two objects are different, they are not equal objects (assuming a correct implementation, whether overridden or not)
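
The two rules above can be sketched in a few lines. `Point`, `HashContractDemo`, and `mightBeEqual` are hypothetical names introduced for illustration; the strings `"Aa"` and `"BB"` are a well-known pair that share a `String.hashCode()` value while being unequal.

```java
import java.util.Objects;

// A small value class with a consistent equals/hashCode pair.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        // Derived from the same fields that equals() compares.
        return Objects.hash(x, y);
    }
}

final class HashContractDemo {
    // Equal hash codes mean the objects *may* be equal;
    // different hash codes mean they are definitely not equal.
    static boolean mightBeEqual(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Same hash code, but not equal objects.
        System.out.println(mightBeEqual("Aa", "BB") + " " + "Aa".equals("BB"));
        // Equal objects always have the same hash code.
        System.out.println(mightBeEqual(new Point(1, 2), new Point(1, 2)));
    }
}
```

A hash-based collection uses the hash code only as a first filter like this, and still calls `equals` to confirm a match.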
Nathan Hughes
Jon Skeet
  • Considering GC, can the hash code even be the memory address in the first place? For use in hash tables, it shouldn't really change unpredictably during execution. – millimoose Apr 19 '13 at 13:13
  • @millimoose: It definitely can't be the *current* memory address in the face of a compacting GC. But it could be "the address at the time of first call, which is then remembered for later" perhaps. I try not to care too much :) – Jon Skeet Apr 19 '13 at 13:15
  • Going by the [old code excerpt I found](http://blogs.tedneward.com/CommentView,guid,eca26c5e-307c-4b7c-931b-2eaf5b176e98.aspx), it definitely *seems* (insofar as I can read the somewhat hairy C) to be "some number that's determined once then saved", with six implementations available, including the initial memory address and an RNG. – millimoose Apr 19 '13 at 13:20
  • @GaborSch First off: hash codes aren't guaranteed to be unique. Second: the implementation might not really be used in a "real" VM, since it'd still have the problem where hash codes are addresses in the (relatively small) Eden space. – millimoose Apr 19 '13 at 13:25
  • @millimoose Sorry, I was wrong, *uniqueness* is not required at all. And if you have large enough *Eden* space (or whatever that JVM is using), the hash values may be well distributed. – gaborsch Apr 19 '13 at 13:30
  • @GaborSch Yup. I noticed the part about uniqueness too late and realised my speculation makes no sense in light of that. – millimoose Apr 19 '13 at 13:37
  • I would add the following: how can a memory address on a 64-bit JVM give a unique 32-bit integer? So, yes, it cannot be the memory address. If we look at http://hg.openjdk.java.net/jdk7/jdk7/hotspot/file/9b0ca45cd756/src/share/vm/oops/markOop.hpp we will see that on a 32-bit machine the hash code is 25 bits wide, while on a 64-bit machine it is 31 bits wide. On every JVM you can have "twins" (different objects with the same hash code); even with -XX:hashCode=5, on Java 8 you can very quickly find twins! – Paul Ianas Sep 10 '14 at 08:12
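
The "twins" observation in the last comment can be tried with a rough sketch like the one below. `TwinFinder` and `allocationsUntilTwin` are hypothetical names, and the 1,000,000 limit is an arbitrary choice for illustration. The map keeps every object reachable, so a repeated hash really does belong to two live, distinct objects.

```java
import java.util.HashMap;
import java.util.Map;

final class TwinFinder {
    // Allocates objects until two live objects share an identity hash code.
    // Returns how many objects were allocated when the first twin appeared,
    // or -1 if none was found within the limit.
    static int allocationsUntilTwin(int limit) {
        // Holding the objects as map values keeps them alive, so a collision
        // cannot be explained by a collected object's hash being reused.
        Map<Integer, Object> seen = new HashMap<>();
        for (int i = 1; i <= limit; i++) {
            Object candidate = new Object();
            Object earlier = seen.putIfAbsent(System.identityHashCode(candidate), candidate);
            if (earlier != null) {
                return i; // 'earlier' and 'candidate' are distinct objects with the same hash
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("first twin after " + allocationsUntilTwin(1_000_000) + " allocations");
    }
}
```

With at most 31 usable bits, the birthday bound suggests a twin should typically turn up after tens of thousands of allocations, but the exact count varies from run to run and between JVMs.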
Your answer lies in the documentation for Object.hashCode():

> As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)

Nathan Hughes
AllTooSir