7

If i will override hashCode() method will it degrade the performance of application. I am overriding this method in many places in my application.

Henry
  • 42,982
  • 7
  • 68
  • 84
Subodh Joshi
  • 12,717
  • 29
  • 108
  • 202
  • 3
    overriding doesn't add much overhead, if at all. The efficiency of your hashCode() implementation is up to what it has to do and how you write it. You could cache the value if it's important to you. – Peter Lawrey Aug 12 '13 at 06:47
  • 3
    I would pay more attention to program correctness than performance. Whenever you override `equals` you must override `hashCode` as well. When `a.equals(b)` returns `true` it must also hold that `a.hashCode()==b.hashCode()` – Henry Aug 12 '13 at 06:53

8 Answers8

7

Yes you can degrade the performance of a hashed collection if the hashCode method is implemented in a bad way. The best implementation of a hashCode method should generate the unique hashCode for unique objects. Unique hashCode will avoid collisions and an element can be stored and retrieved with O(1) complexity. But only hashCode method will not be able to do it, you need to override the equals method also to help the JVM.

If the hashCode method is not able to generate unique hash for unique objects then there is a chance that you will be holding more than one objects at a bucket. This will occur when you have two elements with same hash but equals method returns false for them. So each time this happens the element will be added to the list at hash bucket. This will slow down both the insertion and retreival of elements. It will lead to O(n) complexity for the get method, where n is the size of the list at a bucket.

Note: While you try to generate unique hash for unique objects in your hashCode implementation, be sure that you write simple algorithm for doing so. If your algorithm for generating the hash is too heavy then you will surely see a poor performance for operations on your hashed collection. As hashCode method is called for most of the operations on the hashed collection.

Juned Ahsan
  • 67,789
  • 12
  • 98
  • 136
3

It would improve performance if the right data structure used at right place,

For example: a proper hashcode implementation in Object can nearly convert O(N) to O(1) for HashMap lookup

unless you are doing too much complicated operation in hashCode() method

It would invoke hashCode() method every time it has to deal with Hash data structure with your Object and if you have heavy hashCode() method (which shouldn't be)

jmj
  • 237,923
  • 42
  • 401
  • 438
3

It depends entirely on how you're implementing hashCode. If you're doing lots of expensive deep operations, then perhaps it might, and in that case, you should consider caching a copy of the hashCode (like String does). But a decent implementation, such as with HashCodeBuilder, won't be a big deal. Having a good hashCode value can make lookups in data structures like HashMaps and HashSets much, much faster, and if you override equals, you need to override hashCode.

chrylis -cautiouslyoptimistic-
  • 75,269
  • 21
  • 115
  • 152
  • 3
    It can if you do it wrong, but not doing it at all can be even worse. Write a basic `hashCode` using `HashCodeBuilder` and then profile if you think you're having performance problems. Don't waste time and effort (and introduce bugs!) solving a problem you think you have but haven't actually measured. – chrylis -cautiouslyoptimistic- Aug 12 '13 at 06:45
3

Java's hashCode() is a virtual function anyway, so there is no performance loss by the sheer fact that it is overridden and the overridden method is used.

The real difference may be the implementation of the method. By default, hashCode() works like this (source):

As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)

So, whenever your implementation is as simple as this, there will be no performance loss. However, if you perform complex computing operations based on many fields, calling many other functions - you will notice a performance loss but only because your hashCode() does more things.

There is also the issue of inefficient hashCode() implementations. For example, if your hashCode() simply returns value 1 then the use of HashMap or HashSet will be significantly slower than with proper implementation. There is a great question which covers the topic of implementing hashCode() and equals() on SO: What issues should be considered when overriding equals and hashCode in Java?

One more note: remember, that whenever you implement hashCode() you should also implement equals(). Moreover, you should do it with care, because if you write an invalid hashCode() you may break equality checks for various collections.

Community
  • 1
  • 1
Dariusz
  • 21,561
  • 9
  • 74
  • 114
3

Overriding hashCode() in a class in itself does not cause any performance issues. However when an instance of such class is inserted either into a HashMap HashSet or equivalent data structure hashCode() & optionally equals() method is invoked to identify right bucket to put the element in. same applicable to Retrival Search & Deletion.

As posted by others performance totally depends on how hashCode() is implemented. However If a particular class's equals method is not used at all then it is not mandatory to override equals() and hashCode() , but if equals() is overridden , hashcode() must be overridden as well

Prashant Bhate
  • 10,907
  • 7
  • 47
  • 82
1

As all previous comments mentioned, hash-code is used for hashing in collections or it could be used as negative condition in equals. So, yes you can slow you app a lot. Obviously there is more use-cases.

First of all I would say that the approach (whether to rewrite it at all) depends on the type of objects you are talking about.

  1. Default implementation of hash-code is fast as possible because it's unique for every object. It's possible to be enough for many cases.
  2. This is not good when you want to use hashset and let say want to do not store two same objects in a collection. Now, the point is in "same" word.

"Same" can mean "same instance". "Same" can mean object with same (database) identifier when your object is entity or "same" can mean the object with all equal properties. It seems that it can affect performance so far.

But one of properties can be a object which could evaluate hashCode() on demand too and right now you can get evaluation of object tree's hash-code always when you call hash-code method on the root object.

So, what I would recommend? You need to define and clarify what you want to do. Do you really need to distinguish different object instances, or identifier is crucial, or is it value object?

It also depends on immutability. It's possible to calculate hashcode value once when object is constructed using all constructor properties (which has only get) and use it always when hashcode() is call. Or another option is to calculate hashcode always when any property gets change. You need to decide whether most cases read the value or write it.

The last thing I would say is to override hashCode() method only when you know that you need it and when you know what are you doing.

Martin Podval
  • 1,097
  • 1
  • 7
  • 16
1

The main purpose of hashCode method is to allow an object to be a key in the hash map or a member of a hash set. In this case an object should also implement equals(Object) method, which is consistent with hashCode implementation:

If a.equals(b) then a.hashCode() == b.hashCode()

If hashCode() was called twice on the same object, it should return the same result provided that the object was not changed

hashCode from the performance point of view

  • From the performance point of view, the main objective for your hashCode method implementation is to minimize the number of objects sharing the same hash code.
  • All JDK hash based collections store their values in an array.
  • Hash code is used to calculate an initial lookup position in this array. After that equals is used to compare given value with values stored in the internal array. So, if all values have distinct hash codes, this will minimize the possibility of hash collisions.
  • On the other hand, if all values will have the same hash code, hash map (or set) will degrade into a list with operations on it having O(n2) complexity.
  • From Java 8 onwards though collision will not impact performance as much as it does in earlier versions because after a threshold the linked list will be replaced by the binary tree, which will give you O(logN) performance in the worst case as compared to O(n) of linked list.
  • Never write a hashCode method which returns a constant.
  • String.hashCode results distribution is nearly perfect, so you can sometimes substitute Strings with their hash codes.

The next objective is to check how many identifiers with non-unique has codes you still have. Improve your hashCode method or increase a range of allowed hash code values if you have too many of non-unique hash codes. In the perfect case all your identifiers will have unique hash codes.

amitkumar12788
  • 797
  • 3
  • 10
  • 14
0

If you will override hashCode() method will it degrade the performance of application.It would improve performance if the right data structure used at right place,

For example: a proper hashcode() implementation in Object can nearly convert O(N) to O(1) for HashMap lookup.unless you are doing too much complicated operation in hashCode() method

Mohsin Shaikh
  • 494
  • 1
  • 4
  • 11