6

I am looking at the Item[TKey] source of Dictionary<TKey,TValue> trying to understand the mechanism of storage/retrieval in a dictionary, and why it is faster than just checking each entry one by one.

Where I get confused is in the user of prime numbers in buckets field and the interplay with Entry<TKey,TValue>.next.

Can someone explain to me the logic, or point to a reference where I can understand it.

Thanks.

John Alexiou
  • 28,472
  • 11
  • 77
  • 133
  • HashTable + Separate chaining http://en.wikipedia.org/wiki/Hash_table#Separate_chaining – Ani Dec 16 '10 at 14:44
  • 1
    just ignore the "prime number" part for the moment (it is some number-theory-based optimization), and look at the graphs in the /hash table/ wikipedia page. – J-16 SDiZ Dec 16 '10 at 14:53

3 Answers3

5

See Hash Tables and possibly this article about using prime numbers in hash table implementations.

Brian Genisio
  • 47,787
  • 16
  • 124
  • 167
  • Wow, I had no idea. Hashtables seem to be the product of a many PhD thesis from countless of CS majors to find what works best. – John Alexiou Dec 16 '10 at 15:03
5

Look at this Wikipedia article: Hashtable

A dictionary is just a strongly-typed implementation of a hashtable. It has a number of "buckets" where it puts the items with their keys.

When you add an item to the hashtable, it uses the hashcode of the key to decide in which bucket it's going to put it (that's normally a very fast operation, just call GetHashCode and apply a modulo to it). Once it has the bucket (which is some kind of list), it checks if the bucket already contains an item with the same key, and adds it if it's not the case. This is also pretty fast, because each bucket contains only a small subset of all the items in the hashtable.

When you want to retrieve an item based on its key, it determines the bucket based on the hashcode of the key, and looks for an item with this key in the bucket.

Of course, this is a very simplistic description... look at the Wikipedia article for more details.

Thomas Levesque
  • 286,951
  • 70
  • 623
  • 758
0

Look into hashcodes, they allow the dictionary to 'bucket' the keys:

http://msdn.microsoft.com/en-us/library/4yh14awz%28VS.80%29.aspx

Colin E.

ColinE
  • 68,894
  • 15
  • 164
  • 232