2

I read that into a hash table we have a bucket array but I don't understand what that bucket array contains.

Does it contain the hashing index? the entry (key/value pair)? both?

This image, for me, is not very clear:

(reference)

So, which is a bucket array?

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
xdevel2000
  • 20,780
  • 41
  • 129
  • 196
  • why don't you open the implementation/source code of HashTable/HashMap and look at it?? .. http://docs.oracle.com/javase/7/docs/api/java/util/Hashtable.html – TheLostMind Jan 21 '14 at 14:28

5 Answers5

1

What goes into the bucket array depends a lot on what is stored in the hash table, and also on the collision resolution strategy.

When you use linear probing or another open addressing technique, your bucket table stores keys or key-value pairs, depending on the use of your hash table *.

When you use a separate chaining technique, then your bucket array stores pairs of keys and the headers of your chaining structure (e.g. linked lists).

The important thing to remember about the bucket array is that it establishes a mapping between a hash code and a group of zero or more keys. In other words, given a hash code and a bucket array, you can find out, in constant time, what are the possible keys associated with this hash code (enumerating the candidate keys may be linear, but finding the first one needs to be constant time in order to meet hash tables' performance guarantee of amortized constant time insertions and constant-time searches on average).

* If your hash table us used for checking membership (i.e. it represents a set of keys) then the bucket array stores keys; otherwise, it stores key-value pairs.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
1

The array index is mostly equivalent to the hash value (well, the hash value mod the size of the array), so there's no need to store that in the array at all.

As to what the actual array contains, there are a few options:

  • If we use separate chaining:

    • A reference to a linked-list of all the elements that have that hash value. So:

      LinkedList<E>[]
      
    • A linked-list node (i.e. the head of the linked-list) - similar to the first option, but we instead just start off with the linked-list straight away without wasting space by having a separate reference to it. So:

      LinkedListNode<E>[]
      
  • If we use open addressing, we're simply storing the actual element. If there's another element with the same hash value, we use some reproducible technique to find a place for it (e.g. we just try the next position). So:

    E[]
    
  • There may be a few other options, but the above are the best-known, with separate-chaining being the most popular (to my knowledge)

* I'm assuming some familiarity with generics and Java/C#/C++ syntax - E here is simply the type of the element we're storing, LinkedList<E> means a LinkedList storing elements of type E. X[] is an array containing elements of type X.

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
  • So we can say that with the index returned from the hash function we find the entry stored into the bucket array... – xdevel2000 Jan 21 '14 at 14:55
  • @xdevel2000 yes but you don't normally use it directly, normally you use `hashCode % buckets.Length` to find the index. – Scott Chamberlain Jan 21 '14 at 15:01
  • @xdevel2000 Separate chaining will return a **list** of elements matching the hash value (mod the array size), while, with open addressing, you may have to look around a bit to find the correct element (yes, that's a bit vague, but clarifying it would require an extensive explanation). For separate chaining, the list at the index = the hash value of an element will contain that element. And, for open addressing, we'll also find it by starting to look at that index. I would not say that your statement holds entirely true, as, at that index, SC has a list and OA not necessarily that element. – Bernhard Barker Jan 21 '14 at 15:20
0

In practice a linked list of the entries that have been computed (by hashing the key) to go into that bucket.

Kayaman
  • 72,141
  • 5
  • 83
  • 121
0

In a HashTable there are most of the times collisions. That is when different elements have the same hash value. Elements with the same Hash value are stored in one bucket. So for each hash value you have a bucket containing all elements that have this hash-value.

MrSmith42
  • 9,961
  • 6
  • 38
  • 49
0

A bucket is a linked list of key-value pairs. hash index is the one to tell "which bucket", and the "key" in the key-value pair is the one to tell "which entry in that bucket". also check out hashing in Java -- structure & access time, i've bee telling more details there.

Community
  • 1
  • 1
Roam
  • 4,831
  • 9
  • 43
  • 72