0

With respect to 3 contracts mentioned below:

1) Whenever hashCode() is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

From this statement, i understand that, In a single execution of an application, if hashCode() is used one or more times on same object it should return same value.

2) If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same integer result.

From this statement, i understand that, to perform the equality operation(in broad scope) in your subclass, There are at least four different degrees of equality.

(a) Reference equality(==), comparing the internal address of two reference type objects.

(b) Shallow structural equality: two objects are "equals" if all their fields are ==. { For example, two SingleLinkedList whose "size" fields are equal and whose "head" field point to the same SListNode.}

(c) Deep structural equality: two objects are "equals" if all their fields are "equals". {For example, two SingleLinkedList that represent the same sequence of "items" (though the SListNodes may be different).}

(d) Logical equality. {Two examples: (a) Two "Set" objects are "equals" if they contain the same elements, even if the underlying lists store the elements in different orders. (b) The Fractions 1/3 and 2/6 are "equals", even though their numerators and denominators are all different.}

Based on above four categories of equality, second contract will hold good only: if(Say) equals() method returns truth value based on logical_equality between two objects then hashCode() method must also consider logical_equality amidst computation before generating the integer for every new object instead of considering internal address of a new object.

But i have a problem in understanding this third contract.

3) IT IS NOT REQUIRED that if two objects are unequal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

In second contract, As we are saying that hashCode() method should be accordingly[for ex: considering logical_equality before generating integer] implemented, I feel, It is not true to say that, if two objects are unequal according to equals(Object) then hashCode() method may produce same integer results as mentioned in third contract? As per the argument in second contract, hashCode() must produce distinct integer results. One just writing return 42 in hashCode() is breaking second contract!

please help me understand this point!

Roman C
  • 49,761
  • 33
  • 66
  • 176
overexchange
  • 15,768
  • 30
  • 152
  • 347
  • What value type does `hashCode` return? – Sotirios Delimanolis Dec 26 '14 at 14:45
  • hashCode() returns int – overexchange Dec 26 '14 at 14:46
  • It doesn't really matter what you feel. The documentation states that the hashcodes can be the same for distinct values and that's what matters. It's a design decision like any other - so far I haven't read any argument why it "**must**" produce distinct values. – Jeroen Vannevel Dec 26 '14 at 14:49
  • @JeroenVannevel that's true, documentation says: `hashcodes can be the same for distinct values` , then that means we should not link both methods in any context but we also say, `You must override hashCode() in every class that overrides equals()` because what exactly you want to ensure from `hashCode()` method, whenever you override `equals()` you override `hashCode()`? – overexchange Dec 26 '14 at 15:03
  • If you have a set of objects and you override one method but not the other, it might search for your object in the wrong bucket. Or it might search in the right bucket but it might not recognize it. More here: http://stackoverflow.com/questions/27581/what-issues-should-be-considered-when-overriding-equals-and-hashcode-in-java?rq=1 – Jeroen Vannevel Dec 26 '14 at 15:05
  • @MarkRotteveel Can somebody help me understand the reason for down vote? Please comment before down vote. – overexchange Dec 26 '14 at 17:26
  • I did not down vote, and know that people are not required to comment when they down vote. – Mark Rotteveel Dec 26 '14 at 18:01

4 Answers4

5

It would be impossible for hashCode() to always return different values for unequal objects. For example, there are 2^64 different Long values, but only 2^32 possible int values. Therefore the hashCode() method for Long has to have some repeats. In situations like this you have to try hard to ensure that your hashCode() method distributes values as evenly as possible, and is unlikely to produce repeats for the instances you are most likely to use in practice.

The second condition just says that two equal() instances must return the same hashCode() value, so this program must print true:

Long a = Long.MIN_VALUE;
Long b = Long.MIN_VALUE;
System.out.println(a.hashCode() == b.hashCode()); // a.equals(b), so must print true.

However this program also prints true:

Long c = 0L;
Long d = 4294967297L;
System.out.println(c.hashCode() == d.hashCode()); // prints true even though !c.equals(d)
Paul Boddington
  • 37,127
  • 10
  • 65
  • 116
  • if you say, `hashCode()` may not return different values for unequal objects, then i would ask, Does every new object in Java has distinct internal address given, so if you want to just return unique number from `hashCode()` rely on the address of object. that is what `class Object` `hashCode()` method does. – overexchange Dec 26 '14 at 15:20
  • @overexchange Again, no it doesn't. Some implementations convert the internal address to an integer. You may have more objects than there are `int` values. – Sotirios Delimanolis Dec 26 '14 at 15:22
  • @SotiriosDelimanolis if there are more objects than maxvalue of int, then what does jvm assign as an address to an object? – overexchange Dec 26 '14 at 15:31
  • @overexchange Maybe it uses 64bit addresses, maybe it uses 128bit addresses. Its an implementation detail. – Sotirios Delimanolis Dec 26 '14 at 15:34
  • @SotiriosDelimanolis May be, one should know the answer for this topic "How address is assigned for more than 2^31-1 objects?", before we go on these lines – overexchange Dec 26 '14 at 15:37
  • No, you don't need to. It's an implementation detail that is abstracted away from the Java Language and its libraries. – Sotirios Delimanolis Dec 26 '14 at 15:38
  • Ok, then how do we get convinced(prove) that if we follow this approach `typically implemented by converting the internal address of the object into an integer` we may get same same hashCode? not mathematically but running a sample code and following `class Object` 2 default methods without overriding. – overexchange Dec 26 '14 at 15:41
  • 1
    @overexchange You don't have to prove it. It's part of the specification. If it doesn't behave like the documentation says, then it is a bug. Let the developers take care of it. You're not going to start testing all of the Java libraries and JVM details. – Sotirios Delimanolis Dec 26 '14 at 15:47
  • @overexchange Alternatively, get a computer with enough memory and create more than `Integer.MAX_VALUE` number of objects and compare their `hashCode`. – Sotirios Delimanolis Dec 26 '14 at 15:48
  • @SotiriosDelimanolis user level programs(say java.exe process) can allocate objects only on virtual memory. So, virtual memory space for each native process is "8TB" using 64 bit OS, despite you have 4GB RAM chip working with 64 bit processor, Now Do you know how to create more than "Integer.MAX_VALUE" objects? – overexchange Dec 26 '14 at 16:00
  • @pbabcdefp Sorry, I did not know that object addresses range in `long` and so could not read your answer properly – overexchange Dec 26 '14 at 18:51
  • @pbabcdefp One supple question, Did I understand the second contract very well, as mentioned in query? – overexchange Jan 28 '15 at 07:11
1

hashCode() does not have to produce a distinct result. return 0; is a perfectly legal implementation of hashCode() - it ensures that two equal objects will have the same hash code. But it will ensure dismal performance when using HashMaps and HashSets.

It's preferable that hashCode() return values will be distinct (i.e., objects that are not equal should have different hash codes), but it's not required.

Mureinik
  • 297,002
  • 52
  • 306
  • 350
  • here you are breaking second contract ): – overexchange Dec 26 '14 at 14:53
  • @overexchange No, you are not. If they are equal, the hash codes are equal. The second point doesn't say what should happen if they are not equal. – milleniumbug Dec 26 '14 at 14:53
  • @overexchange what milleniumbug said - if all the objects have a hashCode of 0, then equal objects have the same hashCode. The fact that unequal objects also have the same hashCode is inconsequential with regard to the rules. – Mureinik Dec 26 '14 at 14:58
1

The second contract states what happens when equals() returns true. It does not say anything about the case when equals() returns false.

The third contract is just a reminder about that fact. It reminds you that when equals() is false for two objects, there is no connection between their hash codes. They may be same or different, as the implementation happens to make them.

  • Arkady, What am saying is, i wonder, Why contract#2 missed saying that when `equals()` returns false then `hashCode()` must return different integer? because when you override `equals()` method you will follow any of the four levels of equality operations mentioned in the query(a/b/c/d), so `hashCode()` method should also follow that logic accordingly. – overexchange Dec 26 '14 at 15:27
  • Read #2 carefully. It says precisely nothing about what happens when equals() is false. It only talks about "If two objects are equal according to the equals(Object) method". What happens in the opposite case is not discussed. –  Dec 26 '14 at 15:30
  • To understand *why* #2 does not specify `!=` case, read the rest of the answers. Basically, `int` is not large enough to ensure that every possible value of every possible type has a distinct number. So Java architects avoid the impossible task by leaving `!=` case unspecified. –  Dec 26 '14 at 15:39
  • OK. so as `int` is not so large and number of objects are more than "int.max" then I can clearly say that `hashCode()` must rely on `long` in that case. if u try that, Does java allow you to say(override), `public long hashCode(){}` then how would you want to attack this problem? first of all how can two new objects can have same address in more than 2^31-1 case, they must be different? only possible answer that i can think of is java has to say the limit on the number of objects that can be created. – overexchange Dec 26 '14 at 15:50
  • You're thinking "all objects currently allocated in the memory". What you should be thinking is "all possible objects of a given type". Consider strings. How many possible strings of length 1M character are there? How can you ensure that every two such strings have different hash codes? –  Dec 26 '14 at 16:11
  • @Arkadiy: Part of the contract for `string` is that its hash code must be computed a certain way based upon the character sequence encapsulated thereby. If a type were allowed to have the first call to `getHashCode()` for a type encapsulating a particular value vary depending upon what objects had previously been asked for hash codes, it would be possible (though often impractical) to ensure that any two objects of a type which exist simultaneously and encapsulate different state will yield different hash values unless more than 4 billion objects exist simultaneously. – supercat Jan 03 '15 at 20:31
0

The third point means that you can have many unequal objects with the same hashcode. . For example 2 string objects can have the same hashcode. The second point states that two equal objects must have the same hashcode. . return 5 is a valid hash implementation because it returns the same value for 2 equal objects.

TheLostMind
  • 35,966
  • 12
  • 68
  • 104
  • `hashCode()` of `class String` works like this: `s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]` and `equals()` of `class String` works like this: `true if same sequence of characters as this object` so we are following contract#2 with "deep stuctural equality" in both methods. so where is the chance to think of contract#3? What does it mean when joshua says: `The key provision that is violated when you fail to override hashCode is the second(contract) one` – overexchange Dec 26 '14 at 15:11
  • @ over exchange : what about the case of s1= new String ("abc") and s2= new String("abc") ?. They are 2 distinct objects with same hashcode. Joshua is trying to say - "the second contract must hold good for objects.. and the way it holds good is by overriding hashcode. – TheLostMind Dec 26 '14 at 15:48