I know that this is an implementation detail, but I'm curious: Is there a bound on the number of hash buckets used in a .NET Dictionary? I assume that it will be somewhere around 2 * numberOfElements, but does anyone know for sure (or is it documented anywhere)?
Asked
Active
Viewed 3,238 times
8
-
1Did you try to look into the source? – Patrick Hofman Jun 23 '14 at 12:52
-
http://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs#602c049c4edafd6d#references – Patrick Hofman Jun 23 '14 at 12:53
-
1Take a look on `HashHelpers` class – Sergey Berezovskiy Jun 23 '14 at 12:54
-
http://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs#dc94bb2ee9650189#references – Habib Jun 23 '14 at 12:55
-
1The source has these two lines `int size = HashHelpers.GetPrime(capacity); buckets = new int[size];` – Habib Jun 23 '14 at 12:56
-
Are you asking how many _can_ it have or how many _does_ it have? – D Stanley Jun 23 '14 at 12:57
-
@PatrickHofman: No. I thought I'd ask on SO first, just in case my Google skills missed some documentation or some development blog post where this is explained. – Heinzi Jun 23 '14 at 13:00
-
@DStanley: Let me put it this way: If I store *n* elements in a dictionary, what are the *f_1(n)*, *f_2(n)* such that *f_1(n) <= numberOfBuckets <= f_2(n)*. – Heinzi Jun 23 '14 at 13:01
-
@Downvoter: I consider this a legitimate and interesting question -- and I'll be glad for any improvement suggestions. – Heinzi Jun 23 '14 at 13:02
-
In short: it uses a size equal to the first prime number greater than [numberOfElements]. However it does not consider every prime number. It uses a table for sizes upto a limit, and for bigger sizes it computes a prime number the hard way. – Dennis_E Jun 23 '14 at 13:06
1 Answers
11
In short: it uses a size equal to the first prime number greater than [numberOfElements]. However it does not consider every prime number: it uses a table for sizes upto a limit, and for bigger sizes it computes a prime number the hard way.
If you look in the source, you can spot the following table in a class called HashHelpers: (I guess this means you need 7.2 million items before it starts computing prime numbers)
public static readonly int[] primes = {
3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197, 239, 293, 353, 431, 521, 631, 761, 919,
1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143, 14591,
17519, 21023, 25229, 30293, 36353, 43627, 52361, 62851, 75431, 90523, 108631, 130363, 156437,
187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403, 968897, 1162687, 1395263,
1674319, 2009191, 2411033, 2893249, 3471899, 4166287, 4999559, 5999471, 7199369};

Dennis_E
- 8,751
- 23
- 29
-
3Actually, a lot of primes are missing. 2, 5, 13, 19, 31, etc. It looks like it just uses primes that are sufficiently far apart, since it's not worth it to worry about all the intermediate ones. – Kendall Frey Jun 23 '14 at 13:17
-
@KendallFrey I suppose so. There are A LOT of primes. Storing them all would be crazy. It doesn't matter much if the capacity is 17 or 19. But they're not doing it in the most efficient way, though: http://referencesource.microsoft.com/#mscorlib/system/collections/hashtable.cs#e8668bf19da49963 – Dennis_E Jun 23 '14 at 13:24