Algorithm used for bucket lookup for hashcodes

Question

In most cases, HashSet has lookup complexity O(1). I understand that this is because objects are kept in buckets corresponding to hashcodes of the object.

When lookup is done, it directly goes to the bucket and finds (using equals if many objects are present in same bucket) the element.

I always wonder, how it directly goes to the required bucket? Which algorithm is used for bucket lookup? Does that add nothing to total lookup time?

The word is 'algorithm'. It is derived from a proper name. Abbreviation not permissible, indeed meaningless. — user207421, Feb 25 '15 at 07:11
its basically accessing a index i from array. so no look up time to reach index i. i is calulated from hashcode%array.length ; and hashcode is being calculated from has function. although it takes some amount of constant time and O(1) signifies that it doesen't depends on the size of elements. — anurag gupta, Feb 25 '15 at 07:17
each answer giving some more insight!! I can't accept them all :( — codingenious, Feb 25 '15 at 07:21

Radiodef · Answer 1 · 2015-02-25T07:25:26.523

4

I always wonder, how it directly goes to the required bucket?

The hash code is treated and used as an index in to an array.

The index is determined by hash & (array.length - 1) because the length of the Java HashMap's internal array is always a power of 2. (This a cheaper computation of hash % array.length.)

Each "bucket" is actually a linked list (and now, possibly a tree) where entries with colliding hashes are grouped. If there are collisions, then a linear search through the bucket is performed.

Does that add nothing to total lookup time?

It incurs the cost of a few loads from memory.

edited Feb 25 '15 at 07:25

answered Feb 25 '15 at 07:19

Radiodef

37,180
14
90
125

[`HashMap#hash`](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/util/HashMap.java#HashMap.hash%28java.lang.Object%29) also mangles the hash to distribute the bits better – Niklas B. Feb 25 '15 at 09:47

score 2 · Answer 2 · answered Feb 25 '15 at 07:15

2

Often, the algorithm is simply

hash = hashFunction(key)
index = hash % arraySize

See the wikipedia article on Hash Table for details.

answered Feb 25 '15 at 07:15

NamshubWriter

23,549
2
41
59

Khanna111 · Accepted Answer · 2015-02-25T07:35:37.800

1

From memory: the HashSet is actually backed by a HashMap and the basic look up process is:

Get the key
hash it (hashcode())
hashcode % the number of buckets
Go to that bucket and evaluate equals()

For a Set there would only be unique elements. I would suggest reading the source for HashSet and it should be able to answer your queries.

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java#HashMap.containsKey%28java.lang.Object%29

Also note that the Java 8 code has been updated and this explanation covers pre Java 8 codebase. I have not examined in detail the Java 8 implementation except to figure out that it is different.

edited Feb 25 '15 at 07:35

answered Feb 25 '15 at 07:17

Khanna111

3,627
1
23
25

It is not true that in a set there can be only one element per bucket. If you have a clash (two different elements with the same hash), they would end up in the same bucket. In the source code you linked you can even see that every bucket has a linked list of `Entry` elements. – Vincent van der Weele Feb 25 '15 at 07:29
You are right, what I meant was that there would be only unique keys in the set and no duplicates. Updated. Thanks. – Khanna111 Feb 25 '15 at 07:34

Algorithm used for bucket lookup for hashcodes

3 Answers3