
I will be working with approximately 320,000,000 data points for a high-resolution waveform. Each data point will require 2 floats (XY coordinate) for a total of 8 bytes.

In order to have this memory allocated all at once, I was planning on using a struct such as the following:

public struct Point
{
    public float X; // 4 bytes
    public float Y; // 4 bytes
}

Since a struct is a value type, I am assuming that it consumes only the amount of memory necessary for each variable, as well as some small, fixed amount used by the CLR (Common Language Runtime).

Is there a way I can compute how much memory a struct will use during runtime of my application? That is, granted I know the following:

  • How many variables are in the struct.
  • How many bytes are used for each variable.
  • How many instances of the struct will be alive at a given point in time.
Nicholas Miller
  • the `GC` class has some memory measurement facilities. – j-p May 26 '16 at 22:48
  • A much more serious question is how you plan to store 320 million of these: in an array, in a list, ...? – pm100 May 26 '16 at 22:49
  • https://blogs.msdn.microsoft.com/joshwil/2005/08/10/bigarrayt-getting-around-the-2gb-array-size-limit/ (or the search term `gcAllowVeryLargeObjects`) might be of use here. – spender May 26 '16 at 23:01
  • Not an answer, but I feel it is worth pointing out a mistake I always make: prefer Structs of Arrays (SoA) to Arrays of Structs (AoS). It is a math library thing. If you have to deal with, say, HDF5, AoS will take 10x longer than SoA. Plotting libraries tend to take `plot(xs, ys)`, not `plot(xys)`; the same goes for machine learning libraries, etc. AoS is more logical (to me at least), since each Point is a *thing*, but it does not make for fast – Frames Catherine White May 27 '16 at 00:24

1 Answer


> Since a struct is a value type, I am assuming that it consumes only the amount of memory necessary for each variable, as well as some small, fixed amount used by the CLR (Common Language Runtime).

Nope. Value types do not have any inherent overhead. That's the trade-off for not being able to support inheritance.

So you just pay for the size of the fields it contains.
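As a rough illustration of "paying only for the fields", the sketch below asks the runtime for the size of `Point` and multiplies it by the number of instances. It uses `Marshal.SizeOf<T>()`, which reports the marshaled size; for a simple struct of two floats that should match the in-memory size, but treat that as an assumption for more complex layouts. The `PointSizeEstimate` class and `PayloadBytes` method are hypothetical names used only for this example.

```csharp
using System;
using System.Runtime.InteropServices;

public struct Point
{
    public float X; // 4 bytes
    public float Y; // 4 bytes
}

public static class PointSizeEstimate
{
    // Bytes of field data for `count` Point values,
    // excluding any overhead of the container that holds them.
    public static long PayloadBytes(long count)
    {
        return count * Marshal.SizeOf<Point>();
    }

    public static void Main()
    {
        Console.WriteLine(Marshal.SizeOf<Point>());   // 8
        Console.WriteLine(PayloadBytes(320000000L));  // 2560000000
    }
}
```

The arithmetic is done in `long` because 320,000,000 × 8 bytes does not fit in a 32-bit `int`.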


Exceptions:

If you stick a struct in a variable of type Object, it is boxed and takes on the per-object overhead. Quoting the answer to "What is the memory overhead of a .NET Object":

> I talk about this in a blog post "Of memory and strings". It's implementation-specific, but for the Microsoft .NET CLR v4, the x86 CLR has a per-object overhead of 8 bytes, and the x64 CLR has a per-object overhead of 16 bytes.

The same thing happens if you cast it to an interface type.

If you stick a struct in an array, the array itself has some object overhead plus an integer to store the array's length. But this is a fixed cost regardless of array length.
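One way to check the array case at runtime is to compare `GC.GetTotalMemory` before and after the allocation. This is only a sketch: the reading is approximate, other allocations on the process can skew it, and the `ArrayFootprint` class and `MeasureArrayBytes` method are hypothetical names for this example.

```csharp
using System;

public struct Point
{
    public float X; // 4 bytes
    public float Y; // 4 bytes
}

public static class ArrayFootprint
{
    // Approximate growth of the managed heap caused by
    // allocating a Point[] with `length` elements.
    public static long MeasureArrayBytes(int length)
    {
        long before = GC.GetTotalMemory(true);  // force a collection for a stable baseline
        Point[] points = new Point[length];
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(points);                   // keep the array reachable until measured
        return after - before;
    }

    public static void Main()
    {
        long bytes = MeasureArrayBytes(1000000);
        Console.WriteLine("1,000,000 points: about {0:N0} bytes", bytes);
    }
}
```

The result should land close to 8 bytes per element plus the small fixed array overhead. A single array of 320,000,000 elements would be about 2.56 GB, so on the .NET Framework it would likely also need the `gcAllowVeryLargeObjects` setting mentioned in the comments on the question, and a 64-bit process.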

If you stick a struct in a List<struct>, you have two objects: the list and the array used by the list. So twice the per-object cost, plus a pointer from the list to the array, plus an integer to know how much of the array is currently in use.

If you stick a struct in a List or List<object>, you have the above overhead, plus the cost of one pointer per item in the list, plus the per-object overhead per item in the list.
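To put rough numbers on the difference, the sketch below compares storing the points inline (an array or `List<Point>`) with storing them boxed in a `List<object>`. It uses the per-object overhead figures quoted above (8 bytes on x86, 16 bytes on x64) and ignores the fixed container overhead, any spare capacity in the list, and alignment, so treat the output as an estimate only. `ContainerEstimate` and its methods are hypothetical names for this example.

```csharp
using System;

public static class ContainerEstimate
{
    const long PointBytes = 8; // two 4-byte floats

    // Point[] or List<Point>: elements are stored inline in one array.
    public static long InlineBytes(long count)
    {
        return count * PointBytes;
    }

    // List<object>: one reference per item plus one boxed object per item.
    public static long BoxedBytes(long count, bool is64Bit)
    {
        long pointerBytes = is64Bit ? 8 : 4;
        long objectOverhead = is64Bit ? 16 : 8; // per-object figures quoted above
        return count * (pointerBytes + objectOverhead + PointBytes);
    }

    public static void Main()
    {
        Console.WriteLine(InlineBytes(320000000L));       // 2560000000
        Console.WriteLine(BoxedBytes(320000000L, true));  // 10240000000
    }
}
```

Under these assumptions, boxing on x64 costs about 32 bytes per point instead of 8, roughly four times the memory.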

Jonathan Allen
  • Which means he will end up with 8 bytes for each of his structs if he uses the List approach. – whymatter May 26 '16 at 23:18
  • Keep in mind that the individual fields may be padded. https://msdn.microsoft.com/en-us/library/hx1b6kkd.aspx – Brian Rasmussen May 26 '16 at 23:25
  • @whymatter but it will be larger in a list. It costs space to store things in a list. The struct itself is 8 bytes, but will surely need another chunk of space for the list node. – pm100 May 26 '16 at 23:54
  • If we just take a list of structs, then the list needs space (a pointer to the list, the overhead of the list object itself, and some private fields of the list, i.e. _size, _version, _syncRoot), but the biggest part is the array of structs within the list. This array has some overhead of its own, but beyond that each struct costs just 8 bytes. That means adding 10 or 320,000,000 structs costs 10 * 8 bytes + (all the fixed overhead) or 320,000,000 * 8 bytes + (all the fixed overhead). – whymatter May 27 '16 at 00:04
  • From what I understand, `struct[]` has the least overhead and is almost the same size as a byte[] (it isn't exactly the same because of the field padding pointed out by Brian Rasmussen). For a `List` there is no "twice per-object cost"; @whymatter is correct in this because there is only 1 such array inside the List (you can see via [Reference Source](http://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs)) – Nicholas Miller May 27 '16 at 14:07
  • It seems like an array is best if resizing isn't needed. If resizing is needed then either `List` or `LinkedList`. The difference being that the `List` has slower resizing, but uses less space while the `LinkedList` has much faster resizing, but consumes a lot of overhead for all the `LinkedListNodes`. – Nicholas Miller May 27 '16 at 14:10
  • LinkedListNode has 3 pointers in addition to the struct itself. So on x64 we're talking about an overhead of (16 + 8*3 = 40 bytes) per element. – Jonathan Allen May 28 '16 at 01:17
  • @NickMiller the "twice the per-object cost" for `List` refers to the 8/16 bytes of overhead for the List itself and another 8/16 bytes of overhead for the array it points to. – Jonathan Allen May 28 '16 at 01:20