
On my 64-bit machine, this C# code works:

new byte[2L * 1024 * 1024 * 1024 - 57]

but this one throws an OutOfMemoryException:

new byte[2L * 1024 * 1024 * 1024 - 56]

Why?

I understand that the maximum size of a managed object is 2 GB and that the array object I'm creating contains more than just the bytes I want. Namely, there are 4 bytes (or 8?) for the syncblock number, 8 bytes for the MethodTable reference and 4 bytes for the length of the array. That's 24 bytes including padding, so why can't I allocate an array with 2G - 24 bytes? Is the maximum size really exactly 2 GB? If that's the case, what is the rest of the 2 GB used for?

Note: I don't actually need to allocate an array with 2 billion bytes. And even if I did, 56 bytes is negligible overhead. And I could easily work around the limit using a custom BigArray<T>.

svick
  • The MethodTable reference would be 8 bytes on 64 bits and in addition to that there may be padding to align. – Brian Rasmussen Jul 07 '11 at 20:29
  • @Brian, right, I forgot the reference has to be 64-bit too. I have edited the question, but the question still holds. – svick Jul 07 '11 at 20:32
  • Upper limit on an int value perhaps? a byte[] can't exceed int maxvalue i would imagine (given an integer element index). – Brad Christie Jul 07 '11 at 20:33
  • @Brad, `int.MaxValue` is 2G - 1, so that doesn't explain it. – svick Jul 07 '11 at 20:34
  • 2
    @svick: As always [Jon Skeet](http://stackoverflow.com/questions/5367320/what-is-the-exact-maximum-limit-of-elements-in-an-array/5367331#5367331) has an answer to a .net question similar. (MSDN mentioning this upper limit can be found here: http://msdn.microsoft.com/en-us/library/ms241064%28VS.80%29.aspx) – Brad Christie Jul 07 '11 at 20:36
  • could just be that you don't have enough free RAM. How much memory do you have in your system, and how many processes are running? – Neil N Jul 07 '11 at 20:36
  • @Neil, the actual RAM shouldn't be relevant here, since I have the pagefile turned on. I can allocate 11 of the almost-2G arrays; the 12th attempt throws OOM. – svick Jul 07 '11 at 20:43
  • @Brad, Jon's result is exactly the same as mine (2G - 57), but even he doesn't explain why. – svick Jul 07 '11 at 20:48
  • 1
    This allocates from the Large Object Heap. That one probably allocates directly from the low-fragmentation heap in Windows (depending on your operating system). You need to add the overhead that the Windows memory manager adds to heap blocks in the LFH. Size is about right. – Hans Passant Jul 07 '11 at 20:55
  • 1
    Here is a detailed discussion about .NET array overhead. http://stackoverflow.com/questions/1589669/overhead-of-a-net-array I don't think this is particularly interesting discussion however as it is an implementation detail. Unless you are writing an implementation of the .NET Framework, of course. – Can Gencer Jul 07 '11 at 21:02
  • @Can, that was really interesting, thanks. – svick Jul 07 '11 at 21:12

3 Answers


You need 56 bytes of overhead. The maximum size is actually 2,147,483,648 - 1 (i.e. int.MaxValue) minus 56, which is 2,147,483,591 bytes. That's why your minus 57 works and minus 56 does not.

As Jon Skeet says here:

However, in practical terms, I don't believe any implementation supports such huge arrays. The CLR has a per-object limit a bit short of 2GB, so even a byte array can't actually have 2147483648 elements. A bit of experimentation shows that on my box, the largest array you can create is new byte[2147483591]. (That's on the 64 bit .NET CLR; the version of Mono I've got installed chokes on that.)
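The figures above line up arithmetically. A quick check (nothing here is runtime-specific; these are just the numbers from the answer and Jon Skeet's quote):

```csharp
using System;

class OverheadMath
{
    static void Main()
    {
        long twoGig = 2L * 1024 * 1024 * 1024;  // 2,147,483,648
        long maxByteArray = twoGig - 57;        // the experimental limit

        // int.MaxValue is 2^31 - 1 = 2,147,483,647, so the limit is also
        // int.MaxValue - 56; the 56 bytes are the overhead discussed here.
        Console.WriteLine(maxByteArray);                // 2147483591
        Console.WriteLine(int.MaxValue - maxByteArray); // 56
    }
}
```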

See also this InformIT article on the same subject. It provides the following code to demonstrate the maximum sizes and overhead:

using System;
using System.Runtime.InteropServices;

class Program
{
  static void Main(string[] args)
  {
    AllocateMaxSize<byte>();
    AllocateMaxSize<short>();
    AllocateMaxSize<int>();
    AllocateMaxSize<long>();
    AllocateMaxSize<object>();
  }

  const long twogigLimit = ((long)2 * 1024 * 1024 * 1024) - 1;
  static void AllocateMaxSize<T>()
  {
    int twogig = (int)twogigLimit;
    int num;
    Type tt = typeof(T);
    if (tt.IsValueType)
    {
      num = twogig / Marshal.SizeOf(typeof(T));
    }
    else
    {
      num = twogig / IntPtr.Size;
    }

    T[] buff;
    bool success = false;
    do
    {
      try
      {
        buff = new T[num];
        success = true;
      }
      catch (OutOfMemoryException)
      {
        --num;
      }
    } while (!success);
    Console.WriteLine("Maximum size of {0}[] is {1:N0} items.", typeof(T).ToString(), num);
  }
}

Finally, the article has this to say:

If you do the math, you’ll see that the overhead for allocating an array is 56 bytes. There are some bytes left over at the end due to object sizes. For example, 268,435,448 64-bit numbers occupy 2,147,483,584 bytes. Adding the 56 byte array overhead gives you 2,147,483,640, leaving you 7 bytes short of 2 gigabytes.
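The article's arithmetic can be replayed directly. Note that the article counts the leftover against the 2 GB - 1 limit, which is where its "7 bytes" comes from:

```csharp
using System;

class LongArrayMath
{
    static void Main()
    {
        long count = 268_435_448;            // max long[] length reported by the article
        long payload = count * sizeof(long); // bytes of element data
        long withOverhead = payload + 56;    // including the 56-byte array overhead

        Console.WriteLine(payload);      // 2147483584
        Console.WriteLine(withOverhead); // 2147483640
        // Leftover against the 2 GB - 1 per-object limit:
        Console.WriteLine(2L * 1024 * 1024 * 1024 - 1 - withOverhead); // 7
    }
}
```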

Edit:

But wait, there's more!

Looking around and talking with Jon Skeet, he pointed me to an article he wrote on Of memory and strings. In that article he provides a table of sizes:

Type            x86 size            x64 size
object          12                  24
object[]        16 + length * 4     32 + length * 8
int[]           12 + length * 4     28 + length * 4
byte[]          12 + length         24 + length
string          14 + length * 2     26 + length * 2
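The table can be turned into a rough size estimator. The method names below are made up for illustration; the constants come straight from the x64 column of the table above:

```csharp
using System;

class SizeTable
{
    // Approximate x64 object sizes per Jon Skeet's table; purely illustrative.
    static long ByteArraySize(long length)   => 24 + length;
    static long IntArraySize(long length)    => 28 + length * 4;
    static long ObjectArraySize(long length) => 32 + length * 8;

    static void Main()
    {
        Console.WriteLine(ByteArraySize(1000));   // 1024
        Console.WriteLine(IntArraySize(1000));    // 4028
        Console.WriteLine(ObjectArraySize(1000)); // 8032
    }
}
```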

Mr. Skeet goes on to say:

You might be forgiven for looking at the numbers above and thinking that the "overhead" of an object is 12 bytes in x86 and 24 in x64... but that's not quite right.

and this:

  • There's a "base" overhead of 8 bytes per object in x86 and 16 per object in x64... given that we can store an Int32 of "real" data in x86 and still have an object size of 12, and likewise we can store two Int32s of real data in x64 and still have an object of x64.

  • There's a "minimum" size of 12 bytes and 24 bytes respectively. In other words, you can't have a type which is just the overhead. Note how the "Empty" class takes up the same size as creating instances of Object... there's effectively some spare room, because the CLR doesn't like operating on an object with no data. (Note that a struct with no fields takes up space too, even for local variables.)

  • The x86 objects are padded to 4 byte boundaries; on x64 it's 8 bytes (just as before)

and finally Jon Skeet responded to a question I asked of him in another question where he states (in response to the InformIT article I showed him):

It looks like the article you're referring to is inferring the overhead just from the limit, which is silly IMO.

So to answer your question, actual overhead is 24 bytes with 32 bytes of spare room, from what I gather.

Community
  • But *why* is it 56 bytes? What's stored in them? Also, `Marshal.SizeOf()` doesn't tell you the actual size of the object, only the size of the object if you marshaled it. – svick Jul 07 '11 at 20:56
  • @svick: See this about Overhead of .NET arrays (http://stackoverflow.com/questions/1589669/overhead-of-a-net-array) –  Jul 07 '11 at 20:59
  • @svick: See the definition of Marshal.Sizeof at MSDN (http://msdn.microsoft.com/en-us/library/5s4920fa.aspx) –  Jul 07 '11 at 21:00
  • the doc actually says exactly what I'm saying: “The size of the specified type in **unmanaged** code.” For example, `Marshal.SizeOf(typeof(bool))` returns 4, but I'm still able to allocate `new bool[almost2G]`, which means the actual size of `bool` is just one byte. – svick Jul 07 '11 at 21:03
  • @svick: I found an obscure article on Hash Tables that may have your answer: "16 bytes overhead, 20 bytes array overhead, 8 bytes for pointer to array, 8 bytes per reference to each array entry, 4 bytes for M, 4 bytes for N, 4 bytes for padding" (http://algs4.cs.princeton.edu/34hash/) –  Jul 07 '11 at 21:04
  • @svick: I think because boolean is non-blittable http://msdn.microsoft.com/en-us/library/75dwhxf7.aspx that is why you are correct that you must use sizeof not Marshal.sizeof, otherwise for byte it is ok, unless I am missing something. –  Jul 07 '11 at 21:08
  • @0A0D, that article seems to be about Java, but even then, `byte[]` doesn't contain any array pointers, M or N. I think that's just how the hash table is implemented. – svick Jul 07 '11 at 21:17
  • @svick: oops just realized that. –  Jul 07 '11 at 21:17
  • @svick: see here - http://stackoverflow.com/questions/4676249/tupleint-int-versus-int2-memory-usage/4676309#4676309 –  Jul 07 '11 at 21:26
  • Spare room? What's that? Why can't I put my bytes in there? – svick Jul 08 '11 at 15:01
  • @svick: see here http://stackoverflow.com/questions/1589669/overhead-of-a-net-array –  Jul 13 '11 at 12:07

You can actually find this limit explicitly set and verified in the .NET source code, and it provides some insight into why it was done this way (an efficient implementation of advanced range check elimination):

https://github.com/dotnet/runtime/blob/b42188a8143f3c7971a7ab1c735e31d8349e7991/src/coreclr/vm/gchelpers.cpp

inline SIZE_T MaxArrayLength()
{
    // Impose limits on maximum array length to prevent corner case integer overflow bugs
    // Keep in sync with Array.MaxArrayLength in BCL.
    return 0X7FFFFFC7;
}
...

if ((SIZE_T)cElements > MaxArrayLength())
    ThrowOutOfMemoryDimensionsExceeded();
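For what it's worth, 0x7FFFFFC7 is exactly the 2,147,483,591 figure from the accepted answer, i.e. int.MaxValue minus the 56 bytes of overhead. A quick check (the Array.MaxLength mention in the comment assumes .NET 6 or later):

```csharp
using System;

class MaxLengthCheck
{
    static void Main()
    {
        Console.WriteLine(0x7FFFFFC7);                      // 2147483591
        Console.WriteLine(0x7FFFFFC7 == int.MaxValue - 56); // True
        // On .NET 6+ the same constant is exposed as Array.MaxLength.
    }
}
```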
Ian Griffiths
Alexander

One thing is for sure: you cannot have an odd number of bytes. Allocations usually come in multiples of the native word size, which is 8 bytes in a 64-bit process. So you could be adding another 7 bytes to the array.
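The rounding this answer describes can be sketched as standard round-up-to-multiple arithmetic (this is an illustration, not the CLR's actual allocation code):

```csharp
using System;

class Alignment
{
    // Round a size up to the next multiple of the native word size (8 on x64).
    static long RoundUpToWord(long size, long word = 8) =>
        (size + word - 1) / word * word;

    static void Main()
    {
        Console.WriteLine(RoundUpToWord(2147483591)); // 2147483592
        Console.WriteLine(RoundUpToWord(2147483592)); // 2147483592 (already aligned)
    }
}
```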

yan bellavance
  • 1
    It's usually in multiples of native word sizes. So for 64-bit process, that would be 8 bytes. – svick Jul 07 '11 at 20:45