18

I'm running up against the 2GB object limit in C# (which applies even on 64-bit, for some annoying reason) with a large collection of structs (est. size of 4.2GB in total).

Now obviously using a List is going to give me a list of size 4.2GB, give or take, but would using a list made up of smaller lists, which in turn contain a portion of the structs, allow me to jump this limit?

My reasoning here is that it's only a hard-coded limit in the CLR that stops me instantiating a 9GB object on my 64-bit platform, and it's entirely unrelated to system resources. Also, lists and arrays are reference types, so a List containing lists would only actually contain the references to each list. No single object would therefore exceed the size limit.

Is there any reason why this wouldn't work? I'd try this myself right now but I don't have a memory profiler on hand to verify.

Nick Udell
  • 2,420
  • 5
  • 44
  • 83
  • 5
    An interesting C# question has gone unanswered for 4 minutes. I sense Jon Skeet approaching... – Almo Mar 27 '12 at 15:41
  • 1
Out of curiosity, does it need to be a `List<T>` or `List<List<T>>`? Could you use a `LinkedList<T>`? – James Michael Hare Mar 27 '12 at 15:44
  • 1
@JamesMichaelHare You could, but you'd have a HUGE amount of extra overhead, as each `LinkedListNode` is a class on its own, with multiple references. – Reed Copsey Mar 27 '12 at 15:47
  • Related http://stackoverflow.com/q/1087982/38206 – Brian Rasmussen Mar 27 '12 at 15:52
@ReedCopsey: that's true, but it depends on how much the struct size contributes relative to the reference size; if the structs are really big compared to a reference, a LinkedList could be a quick solution. – digEmAll Mar 27 '12 at 15:55
  • 1
@digEmAll If your struct is big enough for that to be true, it shouldn't be a struct at all... The design guidelines say any type larger than 16 bytes should be a class: http://msdn.microsoft.com/en-us/library/ms229017.aspx `LinkedListNode` alone has 3 references, which on x64 is 24 bytes by itself... You also have completely different performance semantics with LinkedList vs List. – Reed Copsey Mar 27 '12 at 15:59
@ReedCopsey: yeah, that's also true. If it's a value type it should be very small; I didn't think of that... – digEmAll Mar 27 '12 at 16:02
  • @digEmAll Question was specifically about collections of struct ;) – Reed Copsey Mar 27 '12 at 16:03
My structs are unfortunately large due to a limitation of Direct3D11 requiring structs for instance buffers (as far as I can tell, anyway), and these structs must contain several float4s. Total size is about 256 bytes each, but I have a metric ton of them. – Nick Udell Mar 30 '12 at 11:07

5 Answers

13

Now obviously using a List is going to give me a list of size 4.2GB, give or take, but would using a list made up of smaller lists, which in turn contain a portion of the structs, allow me to jump this limit?

Yes - though, if you're trying to work around this limit, I'd consider using arrays yourself instead of letting the List<T> class manage the array.

The 2GB single-object limit in the CLR is exactly that: a limit on a single object instance. When you make an array of a struct (which List<T> uses internally), the entire array is "one object instance" in the CLR. However, by using a List<List<T>> or a jagged array, each internal list/array is a separate object, which allows you to effectively have any size object you wish.
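
For illustration, here's a minimal sketch of that block approach (the class name ChunkedList and the block size are purely illustrative, and a jagged array would work the same way): a flat index is split into a block index and an offset, so no single inner list ever approaches the limit.

using System.Collections.Generic;

// Minimal sketch of a chunked list: the outer list only holds references
// to the inner blocks, so no single object gets anywhere near 2GB.
class ChunkedList<T>
{
    private const int BlockSize = 1 << 20;  // items per block - illustrative
    private readonly List<List<T>> _blocks = new List<List<T>>();

    public long Count { get; private set; }

    public void Add(T item)
    {
        // Start a new block when the last one is full (or none exists yet).
        if (_blocks.Count == 0 || _blocks[_blocks.Count - 1].Count == BlockSize)
            _blocks.Add(new List<T>(BlockSize));

        _blocks[_blocks.Count - 1].Add(item);
        Count++;
    }

    public T this[long index]
    {
        // Split the flat index into (block, offset).
        get { return _blocks[(int)(index / BlockSize)][(int)(index % BlockSize)]; }
        set { _blocks[(int)(index / BlockSize)][(int)(index % BlockSize)] = value; }
    }
}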

The CLR team actually blogged about this, and provided a sample BigArray<T> implementation that acts like a single List<T>, but does the "block" management internally for you. This is another option for getting >2GB lists.

Note that .NET 4.5 will have the option to allow objects larger than 2GB on x64, but it will be something you have to explicitly opt in to.

Reed Copsey
  • 554,122
  • 78
  • 1,158
  • 1,373
2

The List holds references, which are 4 or 8 bytes depending on whether you're running in 32-bit or 64-bit mode. Referencing a 2GB object therefore doesn't increase the actual List size by 2GB; it only increases it by the number of bytes needed to reference that object.

This allows you to reference millions of objects, and each object could be 2GB. If you have 4 objects in the List and each is 2GB, then you would have 8GB worth of objects referenced by the List, but the List object itself would only have used up an extra 4*8=32 bytes.

The number of references you can hold on a 32-bit machine before the List hits the 2GB limit is 536.87 million; on a 64-bit machine it's 268.43 million.

536 million references * 2 GB = A LOT OF DATA!
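
As a quick back-of-envelope check of those figures (ignoring the List's own small overhead): 2,147,483,648 bytes / 4 bytes per reference = 536,870,912 (~536.87 million) references on 32-bit, and 2,147,483,648 / 8 = 268,435,456 (~268.43 million) on 64-bit.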

P.S. As Reed pointed out, the above is true for reference types but not for value types. So if you're holding value types, then your workaround is valid. Please see the comment below for more info.

Kiril
  • 39,672
  • 31
  • 167
  • 226
  • 1
`List<T>`, when used with a value type, does not hold references (other than a reference to an internal array); the array itself is an array of the value type directly (`T[]`). A `List<MyStruct>` with 1000 elements, where MyStruct is 1000 bytes, would use 1,000,000 bytes. This is true for the outer list, but not the "inner" one. – Reed Copsey Mar 27 '12 at 16:02
  • @ReedCopsey Thanks for the correction, I've updated the answer to reflect the difference between value types and reference types. – Kiril Mar 27 '12 at 16:11
0

In versions of .NET prior to 4.5, the maximum object size is 2GB. From 4.5 onwards you can allocate larger objects if gcAllowVeryLargeObjects is enabled. Note that the limit for string is not affected, but "arrays" should cover "lists" too, since lists are backed by arrays.
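
For reference, the opt-in lives in the application's configuration file; a minimal app.config would look something like this (note that even with it enabled, an array is still limited to roughly 2^31 elements per dimension):

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <runtime>
    <!-- Opt in to single objects larger than 2GB on 64-bit (.NET 4.5+) -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>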

Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
0
using System;

// Keeps items in fixed-size pages (a jagged T[][]), so no single array
// ever has to exceed the CLR's per-object size limit.
class HugeList<T>
{
    private const int PAGE_SIZE = 102400;
    private const int ALLOC_STEP = 1024;

    private T[][] _rowIndexes;

    private int _currentPage = -1;
    private int _nextItemIndex = PAGE_SIZE;

    private int _pageCount = 0;
    private int _itemCount = 0;

    #region Internals

    // Moves to the next page, allocating it (and growing the page table if needed).
    private void AddPage()
    {
        if (++_currentPage == _pageCount)
            ExtendPages();

        _rowIndexes[_currentPage] = new T[PAGE_SIZE];
        _nextItemIndex = 0;
    }

    // Grows the outer array of page references by ALLOC_STEP entries.
    private void ExtendPages()
    {
        if (_rowIndexes == null)
        {
            _rowIndexes = new T[ALLOC_STEP][];
        }
        else
        {
            T[][] rowIndexes = new T[_rowIndexes.Length + ALLOC_STEP][];

            Array.Copy(_rowIndexes, rowIndexes, _rowIndexes.Length);

            _rowIndexes = rowIndexes;
        }

        _pageCount = _rowIndexes.Length;
    }

    #endregion Internals

    #region Public

    public int Count
    {
        get { return _itemCount; }
    }

    // Appends an item, starting a new page when the current one is full.
    public void Add(T item)
    {
        if (_nextItemIndex == PAGE_SIZE)
            AddPage();

        _itemCount++;
        _rowIndexes[_currentPage][_nextItemIndex++] = item;
    }

    // Maps a flat index to (page, offset within that page).
    public T this[int index]
    {
        get { return _rowIndexes[index / PAGE_SIZE][index % PAGE_SIZE]; }
        set { _rowIndexes[index / PAGE_SIZE][index % PAGE_SIZE] = value; }
    }

    #endregion Public
}
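
A quick usage sketch, assuming a hypothetical 256-byte MyVertexStruct standing in for the OP's instance-buffer struct:

// MyVertexStruct is a placeholder; any value type works the same way.
struct MyVertexStruct { /* ~256 bytes of float4 fields */ }

class Program
{
    static void Main()
    {
        var list = new HugeList<MyVertexStruct>();

        // ~20 million 256-byte structs is roughly 5GB in total, but it is
        // spread across many pages, so no single array breaks the 2GB limit.
        for (int i = 0; i < 20000000; i++)
            list.Add(new MyVertexStruct());

        MyVertexStruct item = list[12345678];   // indexer splits into page/offset
        System.Console.WriteLine(list.Count);
    }
}
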
Luc Vaillant
  • 165
  • 1
  • 3
0

There's an interesting post around this subject here:

http://blogs.msdn.com/b/joshwil/archive/2005/08/10/450202.aspx

It talks about writing your own 'BigArray' object.

Paddy
  • 33,309
  • 15
  • 79
  • 114