Memory confusion

Question

I have an app in which I'm trying to create a very large "cube" of bytes. A 3 dimensional array (~1000x1000x500) saves all of the values that I'm interested in - but I'm getting out of memory problems. While this was expected, the BEHAVIOR of the various OOM messages that I'm getting has been quite confusing. First:

Foo[,,] foo1 = new Foo[1000, 1000, 500];

Fails with an OOM error, but this does not: Foo[,,] foo1 = new Foo[250, 1000, 500]; Foo[,,] foo2 = new Foo[250, 1000, 500]; Foo[,,] foo3 = new Foo[250, 1000, 500]; Foo[,,] foo4 = new Foo[250, 1000, 500];

Shouldn't these two sets of code consume essentially the same amount of memory?

Also, I was originally getting the error message when ~1.5GB had been consumed, but I assumed that by switching it to a 64bit application would let me use MUCH more memory before failing.

Am I running into stack space limitations? And if so, how can I instantiate this structure entirely on the heap without it ever having to exist (as a single entity) on the stack?

Thanks in advance - I look forward to any light that anyone can shead on this behavior.

I think if you compile for 64x you are going to be ok. I think also that you simply go over the limit for an object with the first array. so yeah the two statement take the approx the same space in memory but it's only because you run over the limit per object of .net (at least for 32x) — Rémi, May 09 '13 at 15:58
Using "new" will put it on the heap. It probably works when you split it up because you are no longer requesting 500 million _contiguous_ "Foo" size pieces. — Markku K., May 09 '13 at 15:58
Are you really sure you need a pre-allocated array of `Foo`. Explain what you are trying to accomplish by using an array of that size, there may be better ways of handling it (perhaps a nested dictionary that only contains populated values you need `Dictionary>>`) — Scott Chamberlain, May 09 '13 at 16:03
The working arrays are a fourth of the size of the array that fails. Your running in the single object limit. You can raise that limit http://stackoverflow.com/questions/1087982/single-objects-still-limited-to-2-gb-in-size-in-clr-4-0 if you want. 64-bit process still have a 2GB object limit unless you raise it. **Enable the flag and it should solve your problem.** — Security Hound, May 09 '13 at 16:17
Markku - perfect - I think that answers my question about the memory behavior. I wasn't taking into account that they had to be contiguous. Thanks! — eejai42, May 09 '13 at 16:44
Scott - that is a fantastic idea (re the nested dictionary). I had never thought of that. That _could_ work, because many of the values (at least in most cases) will have no value... hmmmm - very interesting. What I'm trying to do is create a 3 dimensional "pixel" cube. I then need to repeatedly iterate over the "pixels" and configure meta data about that particular position in space. I **know** that this is not an efficient model memory-wise, but it is a **very** efficient way to model the problem conceptually. Anywho - Thanks Scott, great suggestion!! — eejai42, May 09 '13 at 16:49

score 4 · Answer 1 · answered May 09 '13 at 15:59

Thing is, when you allocate an array, you need contiguous memory.

It is more likely to find 10 blocks of contiguous memory in RAM with a size of 10MB each than to find 1 huge block of 100MB.

Let's say you had 100 bytes of RAM with addresses 0 to 99. If you allocated a block of memory with size 1 byte at position 23 for example, although you have 99 bytes of RAM remaining, if you wanted to allocate a block of memory with a size of 99 bytes, you will fail because the memory must be contiguous. The biggest block you'd be able to allocate in such a case would be 76 bytes long.

Great explanation as well. I hadn't considered the impact of the fact that it must contiguous. Thanks! — eejai42, May 09 '13 at 17:13

score 4 · Answer 2 · edited May 23 '17 at 12:03

A 32 bit app is limited to 4 GB of address space, so that's the upper limit. If the process runs on a 32 bit OS, this is further limited to 2 or 3 GB depending on the setup of the app and the OS.

In the first case you're allocating one big array. In .NET arrays are allocated on the heap, so the stack space is not the issue here. Given the numbers in the question I assume the array is around 1.5 GB. To handle that the CLR needs a contiguous block of memory. If you allocate the same number of bytes in smaller chunks (as in the second example) the runtime will stand a better chance of allocating the memory as needed.

Still, with at least 2 GB available you would think that 1.5 GB should not be a problem, but the fact is that a process doesn't use address space strictly from one end to the other. Once DLLs have been loaded into the process the address space is fragmented.

In my experience 32 bit managed apps (on 32 bit OS) are usually limited to around 1.5 GB of heap space, which would explain the OOM you're seeing.

Running the same app on 64 bit OS will give the app access to the entire 4 GB address space, which will affect the space available for the managed heap.

Turning the app into a 64 bit app, changes the address space size from 4 GB to 8 TB. However, even in this case you should keep in mind that the default maximum size of any .NET object is 2 GB. See this question for details.

dandan78 · Answer 3 · 2013-05-09T16:16:03.253

3

To me, that looks like a memory fragmentation issue. Also, note that new uses the heap.

In the first example you are asking for a very sizeable chunk of memory and it is possible that the OS, regardless of how much RAM you have in your system, might not be able to find a contiguous block of memory that large.

The smaller allocations work because smaller contiguous blocks of memory are always more plentiful than bigger ones.

edited May 09 '13 at 16:16

answered May 09 '13 at 15:59

dandan78

13,328
13
64
78

Nitpicking: In C# `new` doesn't necessarily equals a heap allocation. – Brian Rasmussen May 09 '13 at 20:42

score 3 · Accepted Answer · edited May 23 '17 at 10:25

EDIT

I was musing on a more fully featured implementation of my answer, I thought I'd append. I'm not sure if the parallelization would help, perhaps it depends on the initializer.

using System;
using System.Linq;

public static T[][][] NewJagged<T>(
        int h,
        int w,
        ing d,
        Func<int, int, int, T> initializer = null,
        bool parallelize = true)
{
    if (h < 1)
    {
        throw new ArgumentOutOfRangeException("h", h, "Dimension less than 1.")
    }

    if (w < 1)
    {
        throw new ArgumentOutOfRangeException("w", w, "Dimension less than 1.")
    }

    if (d < 1)
    {
        throw new ArgumentOutOfRangeException("d", d, "Dimension less than 1.")
    }

    if (initializer == null)
    {
        initializer = (i, j, k) => default(T);
    }

    if (parallelize)
    {
        return NewJaggedParalellImpl(h, w, d, initializer);
    } 

    return NewJaggedImpl(h, w, d, initializer);
}

private static T[][][] NewJaggedImpl<T>(
        int h,
        int w,
        int d,
        Func<int, int, int, T> initializer)
{
    var result = new T[h][][];
    for (var i = 0; i < h; i++)
    {
        result[i] = new T[w][];
        for (var j = 0; j < w; j++)
        {
            result[i][j] = new T[d];
            for (var k = 0; k < d; k++)
            {
                result[i][j][k] = initializer(i, j, k);
            }
        }
    }

    return result;
}

private static T[][][] NewJaggedParalellImpl<T>(
        int h,
        int w,
        int d,
        Func<int, int, int, T> initializer)
{
    var result = new T[h][][];
    ParallelEnumerable.Range(0, h).ForAll(i =>
    {
        result[i] = new T[w][];
        ParallelEnumerable.Range(0, w).ForAll(j =>
        {
            result[i][j] = new T[d];
            ParallelEnumerable.Range(0, d).ForAll(k =>
            {
                result[i][j][k] = initializer(i, j, k);
            });
        });
    });

    return result;
}

This makes the function completely generic but still leaves you the simple syntax,

var foo1 = NewJagged<Foo>(1000, 1000, 500);

You could however get fancy and populate in paralell on initialization,

var foo2 = NewJagged<Foo>(
    1000,
    1000,
    5000,
    (i, j, k) =>
        {
            var pos = (i * 1000 * 500) + (j * 500) + k;
            return ((pos % 2) == 0) ? new Foo() : null;
        });

in this instance, populating with a checkerboard effect (I think.);

This may not initially seem to answer your problem ...

If you had a function, something like this

public static T[][][] ThreeDimmer<T>(int h, int w, int d) where T : new()
{
    var result = new T[h][][];
    for (var i = 0; i < h; i++)
    {
        result[i] = new T[w][];
        for (var j = 0; j < w; j++)
        {
            result[i][j] = new T[d];
            for (var k = 0; k < d; k++)
            {
                result[i][j][k] = new T();
            }
        }
    }

    return result;
}

Then you would have encapsulated the initialization of a 3 dimensional jagged array of reference types. This would allow you to do,

Foo[][][] foo1 = ThreeDimmer<Foo>(1000, 1000, 500);

This would avoid the memory fragmentation issues of multidimensional arrays. It would also avoid other pitfalls and limitations, giving you a faster more flexible jagged array instead.

Wow - this has been BY FAR the most helpful post I've ever made. Literally every single response has been enormously helpful. This solution, combined with upgrading to the 4.5 framework have resolved all of my issues! Thanks EVERYONE! — eejai42, May 09 '13 at 17:23

Memory confusion

4 Answers4