Explanation
Your first paragraph describes the default behavior of hashCode. But usually classes override it and create a content-based solution (same as for equals). This especially applies to the String class.
Default hashCode
The default implementation is not written in Java but implemented directly in the JVM; the method is declared with the native keyword. You can always get hold of the original hashCode by using System#identityHashCode; see its documentation:
Returns the same hash code for the given object as would be returned by the default method hashCode(), whether or not the given object's class overrides hashCode(). The hash code for the null reference is zero.
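To make the quoted behavior concrete, here is a minimal sketch. The class name IdentityHashDemo and the helper usesDefaultHash are made up for illustration; only System.identityHashCode and Object#hashCode are real API.

```java
// Hypothetical demo class; the names are illustrative, not part of the JDK.
final class IdentityHashDemo {

    // True when the object's hashCode() matches the default (identity) hash,
    // which is what you see for classes that do not override hashCode().
    static boolean usesDefaultHash(Object obj) {
        return obj.hashCode() == System.identityHashCode(obj);
    }

    public static void main(String[] args) {
        Object plain = new Object();
        String text = new String("hello");

        // Object does not override hashCode(), so both calls agree.
        System.out.println(usesDefaultHash(plain)); // true

        // String overrides hashCode(); identityHashCode still gives access
        // to the default hash of this particular instance.
        System.out.println(text.hashCode());
        System.out.println(System.identityHashCode(text));

        // Documented special case: the null reference hashes to zero.
        System.out.println(System.identityHashCode(null)); // 0
    }
}
```

The two values printed for text will usually differ, since one is derived from the content and the other from the instance, but nothing guarantees that they differ.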
Note that the default implementation of hashCode is not necessarily based on the memory location. It is often related, but you can by no means rely on that (see How is hashCode() calculated in Java). Here is the documentation of Object#hashCode:
Returns a hash code value for the object. This method is supported for the benefit of hash tables such as those provided by HashMap.
The general contract of hashCode is:
- Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
- If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
- It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
The relevant parts are the second and third requirements: hashCode must be consistent with equals (equal objects must produce the same hash), and hash collisions between unequal objects are allowed (but not optimal).
And Object#equals is typically overridden to create custom content-based comparisons (see documentation).
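As a rough illustration of those two requirements, here is a sketch of a class that overrides both methods based on its content. The Point class is hypothetical and only serves as an example; Objects.hash is one common way to combine fields, not the only one.

```java
import java.util.Objects;

// Hypothetical value class used only to illustrate the contract.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Content-based equality: two points are equal if their coordinates match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Point)) {
            return false;
        }
        Point that = (Point) other;
        return x == that.x && y == that.y;
    }

    // Uses exactly the fields that equals compares, so equal points
    // always produce the same hash (second requirement).
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

With this sketch, new Point(1, 2) and another new Point(1, 2) are equal and share a hash. On the other hand, new Point(0, 31) and new Point(1, 0) are not equal but both hash to 992, which the third requirement explicitly permits.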
String hashCode
Now let us take a look at the implementation of String#hashCode. As said, the class overrides the method and implements a content-based solution. So the hash for "hello" will always be the same as for "hello", even if you force new instances using the constructor:
// Will have the same hash
new String("hello").hashCode()
new String("hello").hashCode()
This is consistent with equals, which returns true here as well:
new String("hello").equals(new String("hello")) // true
as required by the contract of the hashCode method (see documentation).
Here is the implementation of the method (JDK 10):
/**
 * Returns a hash code for this string. The hash code for a
 * {@code String} object is computed as
 * <blockquote><pre>
 * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
 * </pre></blockquote>
 * using {@code int} arithmetic, where {@code s[i]} is the
 * <i>i</i>th character of the string, {@code n} is the length of
 * the string, and {@code ^} indicates exponentiation.
 * (The hash value of the empty string is zero.)
 *
 * @return a hash code value for this object.
 */
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        hash = h = isLatin1() ? StringLatin1.hashCode(value)
                              : StringUTF16.hashCode(value);
    }
    return h;
}
This just forwards to either StringLatin1 or StringUTF16, so let us see what they do:
// StringLatin1
public static int hashCode(byte[] value) {
    int h = 0;
    for (byte v : value) {
        h = 31 * h + (v & 0xff);
    }
    return h;
}

// StringUTF16
public static int hashCode(byte[] value) {
    int h = 0;
    int length = value.length >> 1;
    for (int i = 0; i < length; i++) {
        h = 31 * h + getChar(value, i);
    }
    return h;
}
As you can see, both of them just do some simple math based on the individual characters in the string. So the hash is completely content-based and will therefore always be the same for the same sequence of characters.
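If you want to convince yourself, you can redo that math by hand. The following is a small sketch that recomputes the documented formula through the public String API instead of the internal byte array; the class StringHashSketch and the method polynomialHash are names made up for this example.

```java
// Hypothetical helper that mirrors the documented formula
// s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] using int arithmetic.
final class StringHashSketch {

    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            // Same recurrence as in the JDK loops shown above.
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("hello"));            // 99162322
        System.out.println("hello".hashCode());                 // 99162322
        System.out.println(new String("hello").hashCode());     // 99162322
        System.out.println(polynomialHash(""));                 // 0
    }
}
```

Since the formula is part of the documented behavior of String#hashCode, the hand-computed value matches the built-in one regardless of which instance you call it on.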