16

I've read in "The Go Programming Language" that a "given key can be retrieved ... using a constant number of key comparisons on average, no matter how large the hash table." I'm not sure what that means in terms of its implementation internally though. Does that mean it searches through every key until it finds a match or is some type of binary (or other) search algorithm used internally?

For example, if I have a map with 2,000 keys, does it "on average" need to look at 1,000 to find a match or does it look at only 11 (log2 n) as it would with binary search?

Nakilon
  • 34,866
  • 14
  • 107
  • 142
user21293
  • 6,439
  • 11
  • 44
  • 57

3 Answers3

40

Maps are implemented as hash tables. There are plenty of places that explain hashing; Here's a nice visualization you can run.

One nice feature of Go is that

  • the source is available on github, and
  • it is pretty well written and documented, so it's not too hard to understand.

From the source file for hashmap:

// A map is just a hash table. The data is arranged
// into an array of buckets. Each bucket contains up to
// 8 key/value pairs. The low-order bits of the hash are
// used to select a bucket. Each bucket contains a few
// high-order bits of each hash to distinguish the entries
// within a single bucket.
//
// If more than 8 keys hash to a bucket, we chain on
// extra buckets.

https://github.com/golang/go/blob/master/src/runtime/map.go

One thing that you can learn from this code that is not normally covered in a lot of classes is how not to invalidate iterators when the map is resized internally.

Mark Harrison
  • 297,451
  • 125
  • 333
  • 465
9

The native map type uses a hash table implementation. It uses a hashing function on the key to generate an index into an array of data. Thus, generally, most actions occur in O(1) time. This is only generally true as some keys can result in the same index when hashed, called a collision, which then must be handled specially.

Hash tables are cool!

Austin Hanson
  • 21,820
  • 6
  • 35
  • 41
4

Overview Map

enter image description here


bucket details of map[string]string

enter image description here

Each bucket stores

  • an array of hash codes
    • The array of hash codes stores the hash code for each key in the bucket, this is used for faster comparison.
  • a list of key-value pairs
    • Each bucket can store a maximum of 8 such key-value pairs.
  • an overflow pointer.
    • The overflow pointer can point to a new bucket if a bucket receives more than 8 values.

What happens when we insert a new value in the map

Hash function is called and a hash code is generated for the given key, based on a part of the hash code a bucket is determined to store the key-value pair. Once the bucket is selected the entry needs to be held in that bucket. The complete hash code of the incoming key is compared with all the hashes from the initial array of hash codes i.e h1, h2... if no hash code matches that means this is a new entry. Now if the bucket contains an empty slot then the new entry is stored at the end of the list of key-value pairs, else a new bucket is created and the entrance is stored in the new bucket, and the overflow pointer of the old bucket points to this new bucket.

What happens when the map grows

Every time the number of elements in a bucket reaches a certain limit, i.e the load factor which is 6.5, the map will grow in size by doubling the number of buckets. This is done by creating a new bucket array consisting of twice the number of buckets than the old array, and then copying all the buckets from the old to a new array but this is done very efficiently over time and not at once. During this, the map maintains a pointer to the old bucket which is stored at the end of the new bucket array.

What happens when we delete an entry from the map

Maps in golang only grow in size. Even if we delete all the entries from the map, the number of buckets will remain the same


Source:

zangw
  • 43,869
  • 19
  • 177
  • 214