11

I was trying to understand how hashCode() works for Java's Object class and found the following declaration in its source:

package java.lang;
public class Object {

 // Some more code

 public native int hashCode();

 // Some other code

}

Now, we know that any class we create implicitly extends the Object class. To try this out, I wrote a small example:

package com.example.entity;
public class FirstClass {
    private int id;
    private String name;
    // getters and setters
}

So this class, FirstClass, implicitly extends the Object class.

Main class:

package com.example.app.main;

import com.example.entity.FirstClass;

public class MainApp {
    public static void main(String[] args) {
        FirstClass fs = new FirstClass();
        fs.setId(1);
        fs.setName("TEST");
        System.out.println("The hashCode for object fs is " + fs.hashCode());
    }
}

Since FirstClass implicitly extends the Object class, it inherits Object's hashCode() method.

I invoked hashCode() on a FirstClass object, and as I haven't overridden hashCode(), in theory it should invoke the Object class's hashCode().

My question is:

As the Object class doesn't seem to have any implementation of this method, how is it able to calculate the hash code for any object?

In my case, when I run the program the hash-code which it returned was 366712642.

Can anyone help me understand this?

Stefan Zobel
CuriousMind
  • 2
    What makes you think that the `object` class of Java didn't implement `hashCode`? – Neijwiert Mar 08 '18 at 12:08
  • 4
    It has `native` modifier, which means it's implemented most likely in C or C++ and those native methods come with JDK. https://stackoverflow.com/questions/18900736/what-are-native-methods-in-java-and-where-should-they-be-used. Note that Native and abstract methods are not the same ! – whatamidoingwithmylife Mar 08 '18 at 12:08
  • 2
    @Neijwiert: Yes I don't have full understanding on this, that is why I asked this question. As I see only the method declaration (with no body), I got this doubt how it is able to still get the value. – CuriousMind Mar 08 '18 at 12:30

3 Answers

33

Even though some answers here state that the default implementation is "memory" based, this is plain wrong. It has not been the case for many years now.

Under Java 8, you can run:

java -XX:+PrintFlagsFinal | grep hashCode

To get the exact algorithm that is used (5 being default).

  0 == Lehmer random number generator, 
  1 == "somehow" based on memory address
  2 ==  always 1
  3 ==  increment counter 
  4 == memory based again ("somehow")
  5 == read below

By default (5), it uses the Marsaglia XOR-Shift algorithm, which has nothing to do with memory.

This is not very hard to prove, if you do:

 System.out.println(new Object().hashCode());

multiple times, each time in a fresh VM, you will get the same value. So Marsaglia XOR-Shift starts from a seed (always the same, unless some other code alters it) and works from that.
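To make the determinism argument concrete, here is a minimal sketch of a Marsaglia xor-shift generator with four words of state. It is not HotSpot's code: the class name, the seed constants, and the 31-bit mask are assumptions chosen for illustration only. It shows that a generator of this kind derives each value purely from its previous state, so the same seed always yields the same sequence, with no memory address involved.

```java
// Sketch only: a four-word Marsaglia xor-shift generator.
// The constants and the 31-bit mask are illustrative assumptions,
// not the values a real JVM is guaranteed to use.
final class XorShiftHashSketch {
    private int x;
    private int y = 842502087;  // arbitrary fixed constants for illustration
    private int z = 0x8767;
    private int w = 273326509;

    XorShiftHashSketch(int seed) {
        this.x = seed;
    }

    // Produces the next pseudo-random, non-zero, non-negative value.
    int next() {
        int t = x;
        t ^= t << 11;
        x = y;
        y = z;
        z = w;
        int v = w;
        v = (v ^ (v >>> 19)) ^ (t ^ (t >>> 8));
        w = v;
        int hash = v & 0x7FFFFFFF;        // keep 31 bits (assumed mask)
        return hash == 0 ? 0xBAD : hash;  // avoid 0 as a result
    }

    public static void main(String[] args) {
        // Two generators with the same seed print identical sequences.
        XorShiftHashSketch first = new XorShiftHashSketch(1);
        XorShiftHashSketch second = new XorShiftHashSketch(1);
        for (int i = 0; i < 3; i++) {
            System.out.println(first.next() + " " + second.next());
        }
    }
}
```

Each line printed by main contains the same number twice, which mirrors the observation that a fresh VM starting from the same state reports the same first hash code.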

But even if you switch to a hashCode algorithm that is memory based, objects can move around (when the garbage collector runs), so how do you ensure that the same hashCode is returned after GC has moved the object? Hint: identityHashCode and object headers.
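The hint above can be checked from plain Java. The following small sketch (class names are hypothetical) shows that for a class that does not override hashCode(), the value equals System.identityHashCode(), and that it does not change after a garbage collection has been requested. Note that System.gc() is only a request, so this demonstrates the stability guarantee without proving that the object actually moved.

```java
// Sketch: the default hash code is the identity hash code and stays stable.
final class IdentityHashDemo {
    // Does not override hashCode(), so Object's implementation is used.
    static final class Plain { }

    // Overrides hashCode(); the identity hash is still available separately.
    static final class Custom {
        @Override
        public int hashCode() {
            return 42;
        }
    }

    // Returns true if the hash code is unchanged after requesting a GC.
    static boolean stableAcrossGc(Object o) {
        int before = o.hashCode();
        System.gc(); // only a request; the JVM may ignore it
        return before == o.hashCode();
    }

    public static void main(String[] args) {
        Plain plain = new Plain();
        Custom custom = new Custom();
        System.out.println(plain.hashCode() == System.identityHashCode(plain)); // true
        System.out.println(custom.hashCode());                                  // 42
        System.out.println(stableAcrossGc(plain));                              // true
    }
}
```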

Eugene
  • Thanks for the information. What information does this flag show? I have never seen this option, can you explain a bit? – CuriousMind Mar 08 '18 at 17:44
  • @CuriousMind I don't know what more I can add here... there are 5 options for hashCode as I said in my answer, the default being `5`. What kind of *more* do you have in mind here? – Eugene Mar 08 '18 at 19:49
  • I was talking about -XX:+PrintFlagsFinal; your answer is perfect, but I had never heard of this option, and that was the information I was asking for. I will try to find it online. Thanks for taking the time to help. – CuriousMind Mar 08 '18 at 19:55
  • 2
    @CuriousMind oh that one. it just prints all the flags that a JVM has and their default values... – Eugene Mar 08 '18 at 19:57
  • 3
    One never stops learning. Will see what I can do about updating my answer to be more accurate. – GhostCat Mar 08 '18 at 20:01
  • @Eugene is there a way to change the algorithm of the hashCode calculation, from 5 to 3 for example? – gstackoverflow Jun 08 '22 at 21:13
  • @gstackoverflow interesting that in jdk-17 that I currently have `java -XX:+PrintFlagsFinal -version | grep code` shows that such an option is not present anymore. I'll look around to see why – Eugene Jun 09 '22 at 05:27
  • @Eugene interesting article: https://shipilev.net/jvm/anatomy-quarks/26-identity-hash-code/ – gstackoverflow Jun 09 '22 at 12:27
7

You are getting things wrong:

public native int hashCode();

doesn't mean there is no implementation. It just means that the method is implemented in the native aka C/C++ parts of the JVM. This means you can't find Java source code for that method. But there is still some code somewhere within the JVM that gets invoked whenever you call hashCode() on some Object.
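As a quick check of the two points above, reflection can confirm that a class without its own hashCode() resolves to the method declared in Object, and that this method carries the native modifier. This is a minimal sketch: the class and helper names are hypothetical, and the native result is what OpenJDK-based JDKs report, so other JVM implementations could in principle differ.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch: inspect where hashCode() comes from and whether it is native.
final class NativeHashCodeCheck {
    // Stand-in for the question's FirstClass; it does not override hashCode().
    static final class FirstClass { }

    // Looks up the public hashCode() method visible on the given type.
    static Method hashCodeOf(Class<?> type) {
        try {
            return type.getMethod("hashCode");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e); // every class has hashCode()
        }
    }

    static boolean isNative(Method method) {
        return Modifier.isNative(method.getModifiers());
    }

    public static void main(String[] args) {
        Method m = hashCodeOf(FirstClass.class);
        System.out.println("Declared in: " + m.getDeclaringClass().getName());
        System.out.println("Native: " + isNative(m));
    }
}
```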

And as the other answer explains, that "default" implementation used the "memory" address of the underlying object. The thing is: in plain Java, there is no notion of "memory addresses". Keep in mind that the JVM is written in C/C++, and the real memory management happens in these native parts of the JVM.

In other words: you can't write Java code that tells you about the "native memory address" of an object.

But as the other answer by Eugene makes clear: the hash being about "memory location" is a thing of the past.

GhostCat
  • Thanks for the answer. So, if I understood correct, even if hashCode() is native, the real implementation is within the JVM, and which is invoked at run time. Are there any compelling reasons that they made this method as native? For a given Object, will the hash-code be different for different platforms, as native is dependent on the platform? – CuriousMind Mar 08 '18 at 12:46
  • 1
    @CuriousMind I added another paragraph to my answer. Hope that helps. – GhostCat Mar 08 '18 at 12:54
  • Thanks a lot GhostCat! I am able to understand it now. BTW, can't we see the source code of C,C++ which actually implements this hashCode()? I saw this link, is this the correct source for this? http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/9b8c96f96a0f/src/share/native/java/lang/Object.c – CuriousMind Mar 08 '18 at 13:39
  • 1
    @GhostCat wrong. The default implementation is *not* memory based. Read this: https://stackoverflow.com/a/49175508/1059372 – Eugene Mar 08 '18 at 14:28
  • @GhostCat *you can't write Java code that tells you about the "native memory address* wrong too. `Unsafe` can do that. But even if you can, GC can move that around, so the numbers would not make any sense anyway – Eugene Mar 08 '18 at 14:33
  • 2
    @Eugene there are two aspects here. Since memory addresses of objects can change when the garbage collector moves them, they have to remember their reported hash code somehow, once it has been queried. But then, there is the problem that objects are created within a thread’s TLAB, in other words, the same memory region, before being moved to the survivor space, if still reachable. So using memory based hash codes bears the risk of having very close values (a poor hash distribution). – Holger Mar 08 '18 at 16:23
  • 1
    @Holger right, IIRC there is a flag in objects header for that, if hashCode has been already computed or not and kept in the space for identityHashCode of that header – Eugene Mar 08 '18 at 19:48
  • 5
    @Eugene The spec for `Object.hashCode` *still* mentions the object's address, which is misleading. I've filed [JDK-8199394](https://bugs.openjdk.java.net/browse/JDK-8199394) to fix this. – Stuart Marks Mar 09 '18 at 06:06
  • 1
    @StuartMarks if you are at it, [this tutorial](https://docs.oracle.com/javase/tutorial/java/IandI/objectclass.html) needs a fix too and it’s even worse. It says “*The value returned by hashCode() is the object's hash code, which is the object's memory address in hexadecimal.*” (as has been cited by one answer here). Saying that an `int` value is “in hexadecimal” makes no sense and that’s a *tutorial* intended to teach beginners the Java programming language… – Holger Mar 09 '18 at 07:46
  • 1
    @Holger Yes, the tutorial is clearly in error. There's another bug for that. While I can change the specification (javadoc) myself, the tutorial is handled differently, and unfortunately it's proven difficult to get it updated. – Stuart Marks Mar 09 '18 at 15:14
1

The default implementation of hashCode() in the Object class is the object's memory address in hexadecimal. The JVM provides and invokes this implementation.

Some helpful links are:

https://docs.oracle.com/javase/tutorial/java/IandI/objectclass.html
https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html

Spandan Thakur
  • 1
    wrong. The default implementation is *not* memory based – Eugene Mar 08 '18 at 14:28
  • 4
    Besides not being a memory address, the hash code is not “in hexadecimal”. That doesn’t make any sense; the hash code is just an `int`. The default implementation of `toString()` produces a hexadecimal representation of the hash code value, but that’s not a property of the hash code itself. – Holger Mar 09 '18 at 07:35
  • thanks for the links! don't know why the comments arguing it's not address based when it's written clearly in the oracle documents – dontloo Oct 10 '19 at 02:40
  • 2
    @dontloo because even oracle documents can be wrong or misleading. This had been addressed in [this comment](https://stackoverflow.com/questions/49172698/default-hashcode-implementation-for-java-objects#comment85380333_49172749) even before you wrote your comment. And consequently, [up-to-date versions of this documentation](https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/lang/Object.html#hashCode()) do not mention memory addresses anymore. – Holger Apr 23 '21 at 07:35
  • @Holger it is unreasonable to expect someone to read every comment on every answer before commenting. Eugene's comment was not helpful; it is never helpful to simply comment "wrong". He should have backed up his assertion with links, much like you helpfully did with your comment. – Paul Jan 22 '22 at 21:48
  • 1
    @Paul well, Eugene also wrote an answer providing more details. Besides that, if someone takes the time to read the lowest-scored answer and the comments below it and to write a comment, is it really unreasonable to expect that person to also read the other answers which other users acknowledged with their votes to be more useful and perhaps their comments, especially the one with a highlighted positive vote count? – Holger Jan 24 '22 at 07:49