3

In most cases, HashSet has lookup complexity O(1). I understand that this is because objects are kept in buckets corresponding to hashcodes of the object.

When lookup is done, it directly goes to the bucket and finds (using equals if many objects are present in same bucket) the element.

I always wonder, how it directly goes to the required bucket? Which algorithm is used for bucket lookup? Does that add nothing to total lookup time?

user207421
  • 305,947
  • 44
  • 307
  • 483
codingenious
  • 8,385
  • 12
  • 60
  • 90

3 Answers3

4

I always wonder, how it directly goes to the required bucket?

The hash code is treated and used as an index in to an array.

The index is determined by hash & (array.length - 1) because the length of the Java HashMap's internal array is always a power of 2. (This a cheaper computation of hash % array.length.)

Each "bucket" is actually a linked list (and now, possibly a tree) where entries with colliding hashes are grouped. If there are collisions, then a linear search through the bucket is performed.

Does that add nothing to total lookup time?

It incurs the cost of a few loads from memory.

Radiodef
  • 37,180
  • 14
  • 90
  • 125
  • [`HashMap#hash`](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/util/HashMap.java#HashMap.hash%28java.lang.Object%29) also mangles the hash to distribute the bits better – Niklas B. Feb 25 '15 at 09:47
2

Often, the algorithm is simply

hash = hashFunction(key)
index = hash % arraySize

See the wikipedia article on Hash Table for details.

NamshubWriter
  • 23,549
  • 2
  • 41
  • 59
1

From memory: the HashSet is actually backed by a HashMap and the basic look up process is:

  • Get the key
  • hash it (hashcode())
  • hashcode % the number of buckets
  • Go to that bucket and evaluate equals()

For a Set there would only be unique elements. I would suggest reading the source for HashSet and it should be able to answer your queries.

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java#HashMap.containsKey%28java.lang.Object%29

Also note that the Java 8 code has been updated and this explanation covers pre Java 8 codebase. I have not examined in detail the Java 8 implementation except to figure out that it is different.

Khanna111
  • 3,627
  • 1
  • 23
  • 25
  • It is not true that in a set there can be only one element per bucket. If you have a clash (two different elements with the same hash), they would end up in the same bucket. In the source code you linked you can even see that every bucket has a linked list of `Entry` elements. – Vincent van der Weele Feb 25 '15 at 07:29
  • You are right, what I meant was that there would be only unique keys in the set and no duplicates. Updated. Thanks. – Khanna111 Feb 25 '15 at 07:34