
Related to this:

System.OutOfMemoryException because of Large Dictionary

I am getting an OutOfMemoryException when I let .NET manage my dictionary. The exception is thrown in the Resize method, so I tried to avoid resizing by passing a large capacity to the constructor. Naturally I started from int.MaxValue just to make sure it would solve the problem, planning to work my way down as far as needed.

However, OutOfMemoryException is thrown even for int.MaxValue. I did a little binary search, and the very first capacity value accepted by the constructor turned out to be int.MaxValue / 32. Not only is that smaller than I expected, it is also just confusing.

So does anyone have any idea why?

Oh, and the project settings are set to target the x64 architecture.

Schultz9999
  • In 64-bit .NET prior to .NET 4.5, the maximum number of items in a dictionary (using a reference for a key and a reference for the value) is 89,478,457. That's an experimentally derived result. See [my blog entry](http://blog.mischel.com/2008/05/21/more-on-net-collection-sizes/) and linked articles. – Jim Mischel Aug 21 '13 at 18:33
  • @Jim `int.MaxValue / 24 = 89478485.3333...`, just 32 1/3 higher than your value. – ErikE Jan 06 '16 at 00:08

2 Answers


The hard limit for Dictionary<TKey, TValue> comes from its private entries field, which has the type Entry[]. Here Entry is a struct:

private struct Entry {
  public int hashCode;  // lower 31 bits of the key's hash code
  public int next;      // index of the next entry in the collision chain, or -1
  public TKey key;
  public TValue value;
}

The minimum size of Entry is 4 + 4 + 4 + 4 = 16 bytes, which occurs when TKey and TValue are both 4-byte types (referring to the aligned size, i.e. the value returned by the CLI sizeof instruction). Combined with the array size limitation in .NET, this results in a limit of int.MaxValue / (2*16) entries, which is exactly the int.MaxValue / 32 you observed. If you use a TKey or TValue type larger than 4 bytes, the maximum dictionary size decreases accordingly.
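If you want to check the limit on your own machine, the binary search you describe can be sketched like this. This is a hypothetical test harness, not code from the question; the exact limit depends on the sizes of TKey and TValue, the platform, and (on .NET 4.5+) whether gcAllowVeryLargeObjects is enabled:

```csharp
using System;
using System.Collections.Generic;

class CapacityProbe
{
    // Returns true if the constructor accepts this capacity without
    // throwing OutOfMemoryException. Note Dictionary rounds the capacity
    // up to the next prime internally, so this slightly over-allocates.
    static bool Fits(int capacity)
    {
        try { var d = new Dictionary<int, int>(capacity); return true; }
        catch (OutOfMemoryException) { return false; }
    }

    static void Main()
    {
        // Binary search for the largest capacity the constructor accepts.
        int lo = 0, hi = int.MaxValue;
        while (lo < hi)
        {
            int mid = lo + (hi - lo + 1) / 2;
            if (Fits(mid)) lo = mid; else hi = mid - 1;
        }
        Console.WriteLine($"Largest accepted capacity: {lo}");
    }
}
```

Each probe allocates (and then abandons) the backing arrays, so the search leans on the GC; it is a diagnostic sketch, not something to run in production.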

For large dictionaries like you are suggesting, a B+-tree implementation of IDictionary<TKey, TValue> would be a more efficient use of system resources and would not be subject to the array size limitation.
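As a stopgap if you don't have a B+-tree implementation at hand, you can also shard the entries across several inner dictionaries so that no single backing array approaches the limit. This is my own sketch, not part of the answer above, and it only shows the lookup/insert path:

```csharp
using System;
using System.Collections.Generic;

// Sketch: spreads entries over several Dictionary instances so each
// backing Entry[] stays well below the single-array size limit.
class ShardedDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue>[] shards;

    public ShardedDictionary(int shardCount = 16)
    {
        shards = new Dictionary<TKey, TValue>[shardCount];
        for (int i = 0; i < shardCount; i++)
            shards[i] = new Dictionary<TKey, TValue>();
    }

    // Pick a shard from the key's hash code (masked to non-negative).
    private Dictionary<TKey, TValue> ShardFor(TKey key) =>
        shards[(key.GetHashCode() & int.MaxValue) % shards.Length];

    public TValue this[TKey key]
    {
        get => ShardFor(key)[key];
        set => ShardFor(key)[key] = value;
    }

    public bool TryGetValue(TKey key, out TValue value) =>
        ShardFor(key).TryGetValue(key, out value);
}
```

With 16 shards, each inner dictionary only has to hold a sixteenth of the entries, which keeps well clear of the per-array ceiling for the ~73M-record case mentioned in the comments.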

Sam Harwell
  • I guess they used a struct to make it faster. If they'd used a class, then the limit would be a lot higher. – Matthew Watson Aug 21 '13 at 17:32
  • @MatthewWatson It's a general purpose type. Item count > 67,108,863 doesn't sound like general purpose use to me. – asawyer Aug 21 '13 at 17:34
  • @MatthewWatson Which is obviously the wise trade-off to make for the most frequently used implementation of `IDictionary`. This is also the reason why none of the methods in `Dictionary` are virtual. :) – Sam Harwell Aug 21 '13 at 17:34
  • Absolutely! I certainly wasn't suggesting that it was a mistake! – Matthew Watson Aug 21 '13 at 17:38
  • @280Z28 This makes a lot of sense and is supported by my observation. I have actually ~73M records to handle. Key is ULONG and value is 4x1byte. While I was waiting for the answer, I was able to make things barely work with `int.MaxValue/28`. – Schultz9999 Aug 21 '13 at 17:44

Just to add to 280Z28's answer:

If you really must have such a large dictionary, AND you are running on x64, AND you are using .NET 4.5 or later, then you can enable large object support. That will allow a much larger dictionary.

In my tests, I could specify a capacity of up to int.MaxValue / 2.

To do so, add to your "app.config" file a section as follows:

<runtime>
  <gcAllowVeryLargeObjects enabled="true" />
</runtime>

So your entire "app.config" file might look like this:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
    <startup> 
        <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5" />
    </startup>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>

See here for more details: http://msdn.microsoft.com/en-us/library/hh285054.aspx

Matthew Watson