
Ok, so I'm working on an application in Unity which is highly memory-dependent. Because of this I'm storing the data as two-byte words inside byte arrays. I also need every 4000 words or so to be instantly accessible, so they can be moved around and modified without copying data; instead of using a single massive byte array, I'm keeping an array of simple objects that each contain only one 8000-byte array (the size of these subunits of data is significant and can't really be changed).

So here's the problem: I expect each object to be less than 9000 bytes in size (which leaves ample room for the administrative overhead of the .NET framework). Yet when I actually test the size of these simple objects, they consume approximately 12000 bytes each?! I understand the need for overhead in memory management, but this roughly 30% increase just isn't acceptable. The memory efficiency of whatever data structure I use is very important, as the application will need to have as much as 20 GB of data loaded into memory at once. A 30% overhead means the application won't be able to run on my computer.

So where is the memory being used, and is there a way for me to avoid this large amount of overhead? I'm afraid I don't understand the .NET backend well enough to have any idea what's going on. I've already looked over the following thread, but I don't see anything there that would explain the amount of overhead I'm experiencing: C# Object Size Overhead
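
For reference, here's the back-of-the-envelope layout math behind my 9000-byte expectation (rough figures for a typical 64-bit .NET heap; I haven't verified how Unity's runtime actually lays objects out, so treat these numbers as assumptions):

// Assumed per-entry cost on a 64-bit runtime (approximate, unverified):
//   the byte[8000] itself      : ~24 bytes of object header/length overhead + 8000 bytes of data ≈ 8024 bytes
//   the wrapper object holding it : ~16 bytes of object header + an 8-byte reference field        ≈ 24 bytes
//   the slot in the outer object array (one reference)                                            = 8 bytes
//   expected total per entry                                                                      ≈ 8056 bytes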

While I suppose it's possible that Unity is responsible for the overhead, I'm not sure that it is, since I'm not using any Unity libraries or modules for this specific functionality, aside from NUnit for testing.

Here's an example class that I'm using to store the data:

public class dataObject {
    public byte[] data;

    public dataObject() {
        this.data = new byte[8000];
    }
}

And this is the code I'm using to test the size of the class in memory:

long mem1;
long mem2;
int N = 10000; // number of objects to create, a larger number gives a more accurate approximation of object size

mem1 = GC.GetTotalMemory(true);
dataObject[] dataObjectArray = new dataObject[N];
for (int i = 0; i < N; i++) {
    dataObjectArray[i] = new dataObject();
}
mem2 = GC.GetTotalMemory(true);

long dataObjectArraySize = (mem2 - mem1) / N;
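
In case the measurement itself is the problem, here's a minimal standalone version of the same check, meant to be run as a plain .NET console app with nothing from Unity loaded (just a sketch; the `Program`/`Main` wrapper is only illustrative):

using System;

class Program {
    static void Main() {
        const int N = 10000;

        // Force a full collection before and after so the difference is
        // (mostly) just the memory allocated in between.
        long before = GC.GetTotalMemory(true);

        var objects = new dataObject[N];
        for (int i = 0; i < N; i++) {
            objects[i] = new dataObject();
        }

        long after = GC.GetTotalMemory(true);

        // Keep the array reachable until after the second reading.
        GC.KeepAlive(objects);

        Console.WriteLine($"approx. bytes per object: {(after - before) / N}");
    }
}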

[Edit]:

I tried using a struct rather than a class for storing the data, without any significant change in the measured size. Here's my struct implementation (which, as 3Dave points out, makes more sense here than a class instance):

public struct dataStruct {
    public byte[] data;

    public dataStruct() {
        this.data = new byte[8000];
    }
}
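
One direction I haven't tried yet is a C# fixed-size buffer, which stores the 8000 bytes inline in the struct rather than behind a separate heap-allocated array, so an array of these structs would hold all of the data contiguously. It requires unsafe code to be enabled in the project, and I haven't checked how Unity handles it, so this is only a sketch (the dataStructInline name is just illustrative):

public unsafe struct dataStructInline {
    // Fixed-size buffer: the 8000 bytes live inside the struct itself,
    // so there is no separate byte[] object and no per-array header.
    public fixed byte data[8000];
}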
  • Have you tried using a `struct` instead of a class? Also, you need to consider word/byte alignment. Objects may be forced to align, introducing some overhead per instance. – 3Dave May 05 '21 at 20:35
  • It is misguided to think that C# has anything to do with this. In Unity C# is just a "scripting" language because it will all ultimately be converted to C++ then compiled to a platform specific assembly. So I think most of the time it is very limited what you can do about certain things. Specially because you treat your code like .NET when in reality that is not 100% the case... – Jonathan Alfaro May 05 '21 at 20:40
  • @JonathanAlfaro That's not entirely true. That requires the `il2cpp` scripting backend. – 3Dave May 05 '21 at 20:41
  • Also, `GC.GetTotalMemory` isn't really the right way to determine the size in this case. There are all kinds of things running in Unity that allocate memory, potentially in parallel. That'd be fine in a simple .NET app, but, as has been stated, Unity isn't .NET. – 3Dave May 05 '21 at 20:48
  • @3Dave depends not only on the scripting engine but also on the target platform. That being said Unity uses its own garbage collection albeit https://en.wikipedia.org/wiki/Boehm_garbage_collector – Jonathan Alfaro May 05 '21 at 20:51
  • Your `dataObjectArray` is an array of references, not an array of instances; the instances could be spread all over the heap. An array of structs would make more sense, but it won't necessarily fix the alignment problem. – 3Dave May 05 '21 at 20:52
  • @3Dave Would objects be aligned in such a way that could generate 3KB of unused space per object? I'm fairly sure the byte array itself shouldn't be causing that much misalignment, unless somehow each byte in the array is being aligned (which I think is doubtful as that could quadruple the size of the array, which is not what I'm seeing). – Katelyn Rogers May 05 '21 at 20:52
  • Again, the array you're allocating is an array of *references*, not an array of values. Your `byte[] data` member is itself a reference. C# isn't C. So, you've got an array of references to a datatype that itself contains a reference to an array, that could be placed anywhere on the heap and aligned in whatever manner the runtime (or memory manager) sees fit. There's no way that `GC.GetTotalMemory` is going to return anything meaningful. – 3Dave May 05 '21 at 20:55
  • @3Dave So that unfortunately means that the overhead can't be avoided without some more direct control over how the runtime is placing the actual data behind each referenced object/array? – Katelyn Rogers May 05 '21 at 21:00
  • Oh I just read your edit, I need a better way to test memory usage... – Katelyn Rogers May 05 '21 at 21:01
  • No, it means that you need to restructure your code, and think carefully about the data structures you're using, in order to make it more friendly for the memory manager. At the moment, your code doesn't even measure the amount of data you're allocating, much less the overhead. Enlighten runs at startup and allocates all kinds of stuff, as do other processes in the engine. You're measuring *everything* that the GC is responsible for - not just your code. – 3Dave May 05 '21 at 21:02
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/232008/discussion-between-katelyn-rogers-and-3dave). – Katelyn Rogers May 05 '21 at 21:03

0 Answers