1

I am learning Java collection framework in Java, and got fair idea of various Classes and interfaces.

While going through Set interface, one of the implementations is HashSet (among others).

I am not able to understand what is the logic of implementing Set based on the Hash, what advantage does it serve?

Can anyone help me understand what is the need of Hash based implementation of Set in Java Collection Framework?

GhostCat
  • 137,827
  • 25
  • 176
  • 248
CuriousMind
  • 8,301
  • 22
  • 65
  • 134
  • 2
    This gives great speed. Operation which set do the most is `contains` where it needs to check if set already has element equal to the one which we want to add. Letting set organize elements in groups based on their hashes eliminate the need to check groups with different hash than of element which we are testing. So such set don't need to iterate over all elements, but only on those with *similar* hash code. – Pshemo Dec 15 '17 at 15:18
  • @Pshemo: But we also know that two different elements may lead to same hash values; how is this uniqueness ensured then? Where can we get more info? – CuriousMind Dec 15 '17 at 15:20
  • 1
    For each key in the HashSet a list of items is maintained. If you get a hash collision both items are added to the list. The equals() method on the item is then used to determine which one you want. https://stackoverflow.com/questions/12909325/hashset-collisions-in-java – bhspencer Dec 15 '17 at 15:22
  • 1
    I added clarification to my previous comment. In short when invoking `set.add(x)` set get `x.hashcode()` and now it knows which group of elements it should iterate, to look for equal element. It can skip iterating elements from other groups because it already knows that they have different hashes so they can't be equal to `x` element. So if we have 4 groups equally populated then it speeds process 4 times because it needs to iterate max 1/4 elements. If we have 256 groups equally populated then performance also increases ~256 times (compared to iterating over all elements). – Pshemo Dec 15 '17 at 15:25

1 Answers1

5

Simple: performance and efficiency.

The key element of Sets is: uniqueness. You want to quickly be able to tell wether some x is in (x, y, z) or not.

And hashing is a very elegant and efficient way of approaching this problem.

That is all there is to this.

GhostCat
  • 137,827
  • 25
  • 176
  • 248