0

I found this sentence in my book:

If hashcodes of two objects are equals, that may not mean that objects are equals.

Can someone please explain this sentence to me?

Raedwald
user5507755
  • There is a difference between 2 equal objects (= same address in memory) and 2 objects with equal content (= different address in memory). – Jörn Buitink Dec 10 '15 at 08:50
  • Possible duplicate of [Hashcode() Vs Equals()](http://stackoverflow.com/questions/11850929/hashcode-vs-equals) – Gaurav Jeswani Dec 10 '15 at 08:53
  • If hashcodes are of type `int` and your object contains a `String`, you cannot possibly have a different hashcode for each object. Therefore, equal hashcodes are a **necessary** but not **sufficient** condition for two objects to be equal. Put another way, a hashcode has no **type 1 error**, but in order to achieve that it has a potentially high rate of **type 2 error**. – Boris the Spider Dec 10 '15 at 08:55
  • The sentence is a clear enough statement of fact. What do you not understand about it? – Raedwald Dec 11 '15 at 09:17

6 Answers

11

Consider, for example, two objects of the Long class. Since hashCode returns an int, and the long (and Long) type has a larger range than int, this means there must be two Long objects that have the same hashCode even though they are not equal to each other.
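
To make the `Long` case concrete, here is a minimal sketch. The class name `LongCollision` and the helper `sameHash` are made up for illustration; the only library fact it relies on is the documented `Long.hashCode()` formula, `(int)(value ^ (value >>> 32))`, under which `0L` and `-1L` both hash to `0`.

```java
class LongCollision {
    // Hypothetical helper: true when two long values share a hash code.
    static boolean sameHash(long x, long y) {
        return Long.valueOf(x).hashCode() == Long.valueOf(y).hashCode();
    }

    public static void main(String[] args) {
        Long a = 0L;
        Long b = -1L;
        // The two values differ, so equals() reports false.
        System.out.println(a.equals(b));                        // false
        // Both fold down to the same 32-bit hash code.
        System.out.println(a.hashCode() + " " + b.hashCode());  // 0 0
        System.out.println(sameHash(0L, -1L));                  // true
    }
}
```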

Eran
  • There is a term for this: the pigeonhole principle. If there are more pigeons than holes, you'll end up with multiple pigeons in some holes (see https://en.wikipedia.org/wiki/Pigeonhole_principle). – Roy van Rijn Dec 10 '15 at 10:18
6

The answer is simple: `hashCode()` accidentally can produce the same number for two totally different objects.

G. Demecki
  • I would suppose that "_accidentally can produce_" is the source of the downvotes. This implies that hashcode is some random magic method that produces random values. In fact, the decision to produce clashes is **very deliberate**. And the decision as to which clashes to produce is an important performance optimisation. – Boris the Spider Dec 10 '15 at 08:57
  • Of course, the default implementation of `hashCode` in HotSpot is a random number. If you override it, then it is no longer a random value. See [this post](http://stackoverflow.com/a/26975908/1037316) for example. – G. Demecki Dec 10 '15 at 09:02
  • If it were a random number then the hashcode would not work. The default implementation is that the hashcode is based on some internal JDK mechanics, usually the memory location of the object. **The hashcode is never random**. – Boris the Spider Dec 10 '15 at 09:03
  • Nope, this really is a random number, which is then cached internally inside the object header. But that is an *implementation detail*. I don't blame you, because almost everybody makes the same mistake as you. – G. Demecki Dec 10 '15 at 09:07
  • Sure you are not thinking of the `serialVersionUID`? If you read the [documentation](https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#hashCode--) you will find you are mistaken: "_[t]his is typically implemented by converting the internal address of the object into an integer_". If it were "_cached internally inside object header_" then it would be at a **class** level and not an **instance** level and would not work. – Boris the Spider Dec 10 '15 at 09:08
  • Nope :-) please take a look at the [post](http://stackoverflow.com/a/26975908/1037316) I pasted above. Almost everybody makes that mistake because even javadoc of the `hashCode` method is misleading - it is obsolete and no longer true. Do you want confirmation? Read `HotSpot` source code:) – G. Demecki Dec 10 '15 at 09:11
  • @Boris To be honest: not really :) Javadoc says _"(..) it is *typically* implemented"_. So `hashCode` may be related to a memory address, but it doesn't have to be. Javadoc is not wrong, it just can be a bit misleading. – G. Demecki Dec 10 '15 at 12:27
5

A hash code is a numeric value that is used to insert and identify an object in a hash-based collection.

It is a fixed-size value, so it can't be unique for every possible object, and collisions are unavoidable. In other words, `hashCode()` can produce the same value for two different objects.

Example:

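    String first = "wh";
    String second = "xI";
    // The strings differ, so equals() prints false.
    System.out.println(first.equals(second));
    // Yet both hash codes are identical: prints "3793 3793".
    System.out.println(first.hashCode() + " " + second.hashCode());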
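
A collision like this does not break a hash-based collection, because `equals` still tells the keys apart. A small sketch using the same two strings as `HashMap` keys (the class name `CollidingKeys` and the method `build` are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeys {
    // Hypothetical helper: stores two keys that share a hash code.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("wh", 1);
        map.put("xI", 2); // same hashCode as "wh", but not equal to it
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        // Both entries survive, since the keys are not equal.
        System.out.println(map.size());                           // 2
        System.out.println(map.get("wh") + " " + map.get("xI"));  // 1 2
    }
}
```
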
Liviu Stirb
1

In a hash-based implementation, whenever two objects are checked for equality, their hash codes are compared first. Only if the hash codes match is the `equals` method called, and only if that also returns true are the two objects considered equal.
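
The order of checks can be sketched roughly like this. It is an illustration only, not the JDK's actual code; the class `HashThenEquals` and the method `matches` are invented names.

```java
class HashThenEquals {
    // Hypothetical helper: cheap hash comparison first, equals() second.
    static boolean matches(Object stored, Object candidate) {
        if (stored.hashCode() != candidate.hashCode()) {
            return false;                 // different hash codes: cannot be equal
        }
        return stored.equals(candidate);  // same hash code: equals() decides
    }

    public static void main(String[] args) {
        System.out.println(matches("wh", "xI"));             // false: same hash, not equal
        System.out.println(matches("wh", new String("wh"))); // true: same hash and equal
        System.out.println(matches("wh", "zz"));             // false: hashes already differ
    }
}
```

The hash comparison is only a fast filter; `equals` always has the final say.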

Gaurav Jeswani
0

Two equal objects will have the same hash code.

Two objects with the same hash code don't have to be equal.

Let's say the `hashCode` method produces its value by counting the letters of a name (bad practice, but I'm using this example to explain it), while `equals` compares each character:

Paul and Mary would both return the hash code 4, but their characters are not equal.
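
Written out as code, the name-length idea might look like the sketch below. The class `Name` is invented for this illustration, and counting letters is deliberately a poor hash function.

```java
class Name {
    private final String value;

    Name(String value) {
        this.value = value;
    }

    // Deliberately weak: the hash code is just the number of letters.
    @Override
    public int hashCode() {
        return value.length();
    }

    // Equality compares the names character by character.
    @Override
    public boolean equals(Object other) {
        return other instanceof Name && value.equals(((Name) other).value);
    }

    public static void main(String[] args) {
        Name paul = new Name("Paul");
        Name mary = new Name("Mary");
        System.out.println(paul.hashCode() + " " + mary.hashCode()); // 4 4
        System.out.println(paul.equals(mary));                       // false
    }
}
```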

Philipp Sander
0

Even if the hashCode of an object were its memory address at creation time, it would still have to be stored inside the object header because of the garbage collector.

The garbage collector may freely move objects around to do its work, so the current memory address may change at any time. However:

the `hashCode` method must consistently return the same integer, provided no information used in `equals` comparisons on the object is modified.
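
A small sketch of that guarantee, using `System.identityHashCode` and `System.gc()`. The class and method names are made up, and `System.gc()` is only a hint, so the object may or may not actually be moved; the point is that the value must stay the same either way.

```java
class StableIdentityHash {
    // Hypothetical helper: compares the identity hash before and after a GC request.
    static boolean stableAcrossGc() {
        Object o = new Object();
        int before = System.identityHashCode(o);
        System.gc(); // a request only; the collector may relocate the object or do nothing
        int after = System.identityHashCode(o);
        return before == after;
    }

    public static void main(String[] args) {
        System.out.println(stableAcrossGc()); // true
    }
}
```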

Thomas Kläger