4

let's assume you need to store a certain amount of data, which is represented either by a struct or a class. You'll only store a maximum of N elements, if adding more, you'd have to first delete an existing one.

You can use any way you like to store the data - a List, an array (or anything else if you'd like).

In terms of Garbage Collection - does it make a difference whether you use a class or struct to represent the information? Does it make a difference whether you store it in a List or Array?

My intuition is the best way would be using an array of structs, as in principle a GC shouldn't ever need to collect anything here (while the array is stored on the heap, its size remains constant and so no collection ever takes place I believe?).

I'm not entirely sure how a list of structs would be represented - is there any form of garbage collection associated with adding/deleting struct elements?

As for using classes instead of structs, I'd expect once removed from the list, or overridden in the array, the old reference would need to be collected, so either would put on GC pressure.

Would appreciate input if my intuition is correct here or if I'm going wrong anywhere! Thanks

Example code:

public struct SStruct
{
   int ABC;
}

public class SClass
{
  int ABC;
}

public class Test
{
   List<SStruct> _data1;
   List<SClass> _data2;
   SStruct[100] _data3;
   SClass[100] _data4;

  public void run()
  {
    var sStruct = new SStruct();
    var sClass = new SClass();

    if(data1.Count > 0)
      _data1.RemoveAt(0);

    if(data2.Count > 0)
      _data2.RemoveAt(0);

    _data1.Add(sStruct);
    _data2.Add(sClass);
    _data3[0] = sStruct;
    _data4[0] = sClass;
}
Bogey
  • 4,926
  • 4
  • 32
  • 57
  • 1
    you may find this [article](https://learn.microsoft.com/en-us/dotnet/standard/design-guidelines/choosing-between-class-and-struct) interesting – Amit Apr 23 '18 at 09:32
  • 1
    Ostensibly the GC could run quicker if using an array of structs because it wouldn't need to check so many references to see if they were alive (reachable from any root). However, when using an array of structs you can end up making the array size bigger (`N * sizeof(struct)` rather than `N * sizeof(reference)`) and that *could* end up with the array going into the Large Object Heap which is likely to negatively impact memory usage. – Matthew Watson Apr 23 '18 at 09:32
  • at 100 elements i don't think this question really makes much of a difference, also a list is just an array internally anyway. Why... do you have a GC problem or performance problem, or is this just academic ? – TheGeneral Apr 23 '18 at 09:34
  • 2
    I'd use a `List` unless or until *there's a demonstrated performance issue*. Don't make odd choices just because (not having even written or profiled the code) you assume that GC pressure is going to be an overriding concern. – Damien_The_Unbeliever Apr 23 '18 at 09:35
  • There will be noticable difference when using millions of elements, but if you need to manipulate so many elements, it is better to do this in chunks. Keep pulling data from db and process it in chunks and then optimize if perfomance is an issue. – FCin Apr 23 '18 at 09:35
  • You may find this thread helpful: https://stackoverflow.com/questions/16679033/cannot-modify-struct-in-a-list?rq=1. It shows how a list of structs is represented and what happens if you´re to change an elemen in the list. – MakePeaceGreatAgain Apr 23 '18 at 09:36
  • Your intuition is basically correct I think. – Evk Apr 23 '18 at 09:48
  • All - Yes, this is mostly an academic question, but would still be curious. Thanks for the input so far. @TheGeneral rather than the size of the list/array, wouldn't it be more relevant how often underlying data changes? E.g. maybe you cache 100 values, but you receive millions of values per second, so lots of replacements – Bogey Apr 23 '18 at 09:55

1 Answers1

3

If in doubt, just run some tests in your circumstances.


This was pretty easy to just test in its most basic form, however it really depends on what you are doing. but i think given you hypothetical your intuition is probably pretty accurate.

This is just a basic test, and feel free to modify it how ever you like, and load the structs out as much as youd like

Given

public struct TestStruct
{
   public int Value;

}
public struct TestStruct2
{
   public int Value;

   public string AString;
}
public class TestClass
{
   public int Value;
}
public class TestClass2
{
   public int Value;
   public string AString;
}
private static TestStruct[] _arrayOfTestStructs = new TestStruct[100];
private static TestStruct2[] _arrayOfTestStruct2 = new TestStruct2[100];
private static TestClass[] _arrayOfTestClass = new TestClass[100];
private static TestClass2[] _arrayOfTestClass2 = new TestClass2[100];

Tests

var sw = new Stopwatch();

Random rand = new Random();
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine("struct1");
Console.WriteLine("Start   : {0:N0}", GC.GetTotalMemory(false));
sw.Start();
for (int i = 0; i < 100000000; i++)
{
   _arrayOfTestStructs[rand.Next(100)] = new TestStruct();
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
sw.Reset();

Console.WriteLine("Working : {0:N0}", GC.GetTotalMemory(false));
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine("Collect : {0:N0}", GC.GetTotalMemory(false));
Console.WriteLine("Class1");
sw.Start();
for (int i = 0; i < 100000000; i++)
{
   _arrayOfTestClass[rand.Next(100)] = new TestClass();
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
sw.Reset();
Console.WriteLine("Working : {0:N0}", GC.GetTotalMemory(false));
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine("Collect : {0:N0}", GC.GetTotalMemory(false));
Console.WriteLine("struct2");
sw.Start();
for (int i = 0; i < 100000000; i++)
{
   _arrayOfTestStruct2[rand.Next(100)] = new TestStruct2();
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
sw.Reset();
Console.WriteLine("Working : {0:N0}", GC.GetTotalMemory(false));
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine("Collect : {0:N0}", GC.GetTotalMemory(false));
Console.WriteLine("Class2");
sw.Start();
for (int i = 0; i < 100000000; i++)
{
   _arrayOfTestClass2[rand.Next(100)] = new TestClass2();
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
sw.Reset();
Console.WriteLine("Working : {0:N0}", GC.GetTotalMemory(false));
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine("Collect : {0:N0}", GC.GetTotalMemory(false));
Console.ReadKey();

Ouput

struct1
Start   : 32,460
1647
Working : 40,652
Collect : 40,448
Class1
2203
Working : 716,936
Collect : 41,744
struct2
1537
Working : 41,744
Collect : 41,660
Class2
2244
Working : 2,265,944
Collect : 43,072

A Flakey .Net Demo here

Summary, there is no mystery here and its fairly intuitive, array of structs are faster and create less garbage than equivalent classes.

Obviously a List will be slower again however assuming you are just overwriting the element and not removing it, will probably create similar garbage.

Anyway this is not a conclusive demo

TheGeneral
  • 79,002
  • 9
  • 103
  • 141