52

Recently, in an interview, I was asked: what exactly is a bucket in a HashMap? Is it an array, an ArrayList, or something else?

I got confused. I know HashMaps are backed by arrays. So can I say that a bucket is an array, with an initial capacity of 16, that stores hash codes and whose elements point to the heads of linked lists?

I know how a HashMap works internally; I just wanted to know what exactly a bucket is in terms of data structures.

dgupta3091
  • 4
    you need to read this (http://stackoverflow.com/questions/6493605/how-does-a-java-hashmap-handle-different-objects-with-the-same-hash-code/6493946#6493946) – bananas Jun 22 '16 at 06:33
  • 1
    @JonnyHenly: I specifically wanted to know what a bucket is. The question mentioned is more about how hash codes and the HashMap implementation work, so I don't consider my question a duplicate. The questions might be similar, but the answers they are looking for are different. – dgupta3091 Jun 22 '16 at 06:40

5 Answers

45

No, a bucket is each element of the array you are referring to. In earlier Java versions, each bucket contained a linked list of Map entries. Since Java 8, each bucket contains either a linked list of entries or, once it grows too large, a tree structure of entries.
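
To make the "bucket = one array element" idea concrete, here is a minimal sketch of a chained hash table. It is not the real `java.util.HashMap` code: the names `TinyBucketMap`, `Node`, `bucketIndex` and `chainLength` are invented for this example, the capacity is assumed to be a power of two, and resizing and tree bins are left out.

```java
import java.util.Objects;

// Simplified illustration only; this is NOT the real java.util.HashMap code.
class TinyBucketMap<K, V> {

    // One entry in a bucket's chain (hypothetical, loosely modeled on HashMap.Node).
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Each element of this array is one bucket: null, or the head of a chain.
    private final Node<K, V>[] table;

    @SuppressWarnings("unchecked")
    TinyBucketMap(int capacity) {
        // Assumption of this sketch: capacity must be a power of two.
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        table = (Node<K, V>[]) new Node[capacity];
    }

    // Maps a key to the index of its bucket in the array.
    int bucketIndex(Object key) {
        int h = Objects.hashCode(key); // 0 for a null key
        return (h ^ (h >>> 16)) & (table.length - 1);
    }

    // Stores a mapping and returns the previous value, or null if there was none.
    V put(K key, V value) {
        int i = bucketIndex(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // This sketch prepends the new node to the chain for brevity.
        table[i] = new Node<>(key, value, table[i]);
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = table[bucketIndex(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    // Number of nodes chained in bucket i.
    int chainLength(int i) {
        int count = 0;
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        TinyBucketMap<String, Integer> map = new TinyBucketMap<>(16);
        map.put("Aa", 1);
        map.put("BB", 2); // "Aa" and "BB" have the same hashCode, so they share a bucket
        System.out.println(map.get("Aa") + " " + map.get("BB"));    // prints "1 2"
        System.out.println(map.chainLength(map.bucketIndex("Aa"))); // prints "2"
    }
}
```

The array `table` is the set of buckets, and each bucket holds nothing more than a reference to the first node of its chain.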

From the implementation notes in Java 8:

/*
 * Implementation notes.
 *
 * This map usually acts as a binned (bucketed) hash table, but
 * when bins get too large, they are transformed into bins of
 * TreeNodes, each structured similarly to those in
 * java.util.TreeMap. Most methods try to use normal bins, but
 * relay to TreeNode methods when applicable (simply by checking
 * instanceof a node).  Bins of TreeNodes may be traversed and
 * used like any others, but additionally support faster lookup
 * when overpopulated. However, since the vast majority of bins in
 * normal use are not overpopulated, checking for existence of
 * tree bins may be delayed in the course of table methods.
 ...
Eran
  • 2
    So when we say a HashMap has an initial capacity of 16, it creates an array with 16 slots, and each element is called a bucket? – dgupta3091 Jun 22 '16 at 06:19
  • 2
    @dgupta3091 yes, though in the Java 8 implementation the array is created lazily (i.e. only when the first entry is put in the HashMap). – Eran Jun 22 '16 at 06:22
  • 1
    @JonnyHenly I just checked the implementation in Java 8, and none of the constructors initialize the array. It is only initialized by `resize()` (which will be called by `put` if the array is null) and `readObject(java.io.ObjectInputStream s)` (deserialization). – Eran Jun 22 '16 at 06:31
  • @Eran That makes sense, I forgot the main reason to set an initial capacity is to minimize the number of rehash operations. I just think it's odd to create the array lazily, for instance - you create a hashmap with a rather large initial capacity in a loading method, thinking it will take some time. Then later down the road you find out the first call to put takes longer than the loading function. Perhaps my logic is off, I am pretty tired. Thanks for responding to my comment though. – Jonny Henly Jun 22 '16 at 07:08
  • 2
    @Konstantin yes, each bucket is an instance of `java.util.HashMap.Node` – Eran Aug 06 '20 at 08:15
30

[Image: diagram of HashMap buckets]

I hope this helps you understand the implementation of HashMap.

Arun
  • 1
    Please add some textual explanation to diagram, e.g. how/why a key in the bucket #1 could have greater value than a key in bucket #2. – WebViewer Oct 27 '22 at 08:34
10

The buckets are exactly an array of Nodes, so a single bucket is an instance of the class java.util.HashMap.Node (or null while the bucket is empty). Each Node is the head of a structure similar to a LinkedList or, since Java 8, similar to a TreeMap. HashMap itself decides which is better for performance: keeping a bucket as a linked list or as a tree. The tree form is chosen only when a lot of entries end up in a single bucket, which usually indicates a poorly designed hashCode() function. This is how the buckets are declared in HashMap:

/**
 * The table, initialized on first use, and resized as
 * necessary. When allocated, length is always a power of two.
 * (We also tolerate length zero in some operations to allow
 * bootstrapping mechanics that are currently not needed.)
 */
transient Node<K,V>[] table;
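
The "length is always a power of two" remark matters because it lets the bucket index be computed with a bit mask instead of a modulo. The sketch below illustrates that idea. The class and method names (`BucketIndexDemo`, `roundUpToPowerOfTwo`, `spread`, `indexFor`) are made up for this example, and the rounding helper is my own stand-in, not the JDK's code. The `h ^ (h >>> 16)` and `(n - 1) & hash` expressions are, to my knowledge, the ones the Java 8 source uses.

```java
// Illustrative sketch; the helper names are hypothetical, not part of the JDK.
class BucketIndexDemo {

    // Rounds a requested capacity up to the next power of two.
    // Assumption: requested is at most 2^30, so the shift cannot overflow.
    static int roundUpToPowerOfTwo(int requested) {
        if (requested <= 1) {
            return 1;
        }
        return Integer.highestOneBit(requested - 1) << 1;
    }

    // Mixes the high 16 bits into the low 16 bits, so that hash codes
    // differing only in their upper bits still spread across buckets.
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // Because tableLength is a power of two, masking with (tableLength - 1)
    // always yields an index in the range [0, tableLength).
    static int indexFor(int hashCode, int tableLength) {
        return spread(hashCode) & (tableLength - 1);
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(17));  // prints 32
        System.out.println(indexFor(21, 16));         // prints 5
        System.out.println(indexFor(0x00050000, 16)); // prints 5 (it would be 0 without spread)
    }
}
```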
stinger
0

A HashMap bucket is a slot that can hold multiple nodes. Each entry of the map is stored in a node, the node is placed in the bucket chosen by an index calculation, and the nodes within a bucket are connected to each other as a linked list.
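
A quick way to convince yourself that one bucket can hold many distinct nodes is to force every key into the same bucket. The example below does that with a deliberately bad key type. `CollidingKeys`, `BadKey` and `build` are hypothetical names introduced only for this demonstration, and the example uses nothing but the public `java.util.HashMap` API.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeys {

    // Hypothetical key type whose hashCode is deliberately constant,
    // so every instance is routed to the same bucket.
    static final class BadKey {
        final String name;

        BadKey(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return 42;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
    }

    // Builds a map with n entries that all collide in a single bucket.
    static Map<BadKey, Integer> build(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey("k" + i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = build(100);
        System.out.println(map.size());                 // prints 100
        System.out.println(map.get(new BadKey("k57"))); // prints 57
    }
}
```

All 100 entries survive because `equals` tells the colliding keys apart. The hash code only selects the bucket, and the search inside the bucket is what finds the matching node.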

vishal thakur
-2

Buckets are basically a data structure that is also used in the paging algorithms of operating systems. To put it in layman's terms:

The objects that share a particular hash code are stored in that bucket. (Basically, you can think of the head of the linked list as the hash code value that the bucket represents.)

References to the objects are stored in the linked list whose head represents that hash code value.

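The buckets are created by the HashMap itself, on the JVM heap, and their number is the map's capacity (16 by default, growing as the map is resized), not a function of how much memory the JVM has been allocated.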