3

Looking over the source of List<T>, it seems that there's no good way to access the private _items array of items.

What I need is basically a dynamic list of structs, which I can then modify in place. From my understanding, because C# 6 doesn't yet support ref return types, you can't have a List<T> return a reference to an element, which requires copying of the whole item, for example:

struct A {
  public int X;
}

void Foo() {
  var list = new List<A> { new A { X = 3; } };

  list[0].X++; // this fails to compile, because the indexer returns a copy

  // a proper way to do this would be
  var copy = list[0];
  copy.X++;
  list[0] = copy;


  var array = new A[] { new A { X = 3; } };

  array[0].X++; // this works just fine    
}

Looking at this, it's both clunky from syntax point of view, and possibly much slower than modifying the data in place (Unless the JIT can do some magic optimizations for this specific case? But I doubt they could be relied on in the general case, unless it's a special standardized optimization?)

Now if List<T>._items was protected, one could at least subclass List<T> and create a data structure with specific modify operations available. Is there another data structure in .NET that allows this, or do I have to implement my own dynamic array?

EDIT: I do not want any form of boxing or introducing any form of reference semantics. This code is intended for very high performance, and the reason I'm using an array of structs is to have them tighly packed on memory (and not everywhere around heap, resulting in cache misses).

I want to modify the structs in place because it's part of a performance critical algorithm that stores some of it's data in those structs.

Jakub Arnold
  • 85,596
  • 89
  • 230
  • 327
  • ...almost tempted to close as dupe? [Why isn't Array a generic type?](http://stackoverflow.com/q/14324987/327083) – J... Oct 06 '16 at 21:34
  • 5
    That said - why do you need to store structs? If you want reference semantics, why not use a class instead? – J... Oct 06 '16 at 21:40
  • see also this answer : http://stackoverflow.com/a/51585/327083 – J... Oct 06 '16 at 21:42
  • @J... I'm using structs because of performance reasons (extremely tight loop, need packed memory to avoid cache misses). I don't want reference semantics, I just want "modify field in place", instead of "copy-out, modify, copy-back" on the whole struct ... the – Jakub Arnold Oct 06 '16 at 21:52
  • @J... the first linked answer is completely irrelevant, the second one solves a different problem, because it introduces boxing. I absolutely do not want any form of boxing, I'm trying to avoid reference semantics. – Jakub Arnold Oct 06 '16 at 21:54
  • 2
    Then why did you choose to do this project in C#? Sounds like the wrong tool for the job. – J... Oct 06 '16 at 21:55
  • @J... How is this relevant? C# can achieve extremely high performance, especially when GC isn't even run (because there are no memory allocations) and the memory is tighly packed with structs, which is exactly the context of this question. – Jakub Arnold Oct 06 '16 at 21:58
  • It's relevant because you are trying to do something that is extremely un-csharp-y in mutating conceptually immutable objects. The *potential* for performance is not really meaningful if you don't have the tools to exploit it. – J... Oct 06 '16 at 22:03
  • @J... C# and .NET in general is being used for high performance code all over the place. The reason why structs with in-place memory semantics are in C# is performance. Just because they have copy semantics doesn't mean they shouldn't be mutated. There are tons of tools to write fast C# code, and the question I posted has multiple solutions. The reason why I asked is if someone encountered this before and maybe if there is a data structure in .NET already that solves this, before I go on writing my own. – Jakub Arnold Oct 06 '16 at 22:08
  • @J...: There's nothing conceptually immutable about his objects. – Ben Voigt Oct 06 '16 at 22:09
  • @BenVoigt I meant treating value types by reference (ie: pointer semantics) which feels a lot more c++ to me than c#. – J... Oct 06 '16 at 22:19
  • @J...: It is more C++-esque than pervasive C# style -- but mutable structures is very standard in high-performance applications of C#. C# performance can be quite good, and if you care more about throughput (average performance) than jitter (worst-case performance) C# remains a very nice tool. – Ben Voigt Oct 06 '16 at 22:20
  • @BenVoigt In a lot of cases, I agree, certainly. For some types of things, though, it's almost more work just working around the language when others do it much more directly. I use tight structures in other languages and it's frustrating to not have a pointer when you want one in C# sometimes. I find trying to do this type of thing in C# ends up in a mess that doesn't really integrate well into other code that you want to write in a C# style...and if you're going to have that, and have to wrap it, I'd rather just write it in something else...and wrap it. – J... Oct 06 '16 at 22:25
  • My library has `InternalList`, which allows high-performance access to the internal array via the `InternalArray` property: http://core.loyc.net/collections/internal-list.html – Qwertie Jun 28 '21 at 02:52

1 Answers1

3

Is there another data structure in .NET that allows this, or do I have to implement my own dynamic array?

Neither.

There isn't, and can't be, a data structure in .NET that avoids the structure copy, because deep integration with the C# language is needed to get around the "indexed getter makes a copy" issue. So you're right to think in terms of directly accessing the array.

But you don't have to build your own dynamic array from scratch. Many List<T>-like operations such as Resize and bulk movement of items are provided for you as static methods on type System.Array. They come in generic flavors, so no boxing is involved.

The unfortunate thing is that the high-performance Buffer.BlockCopy, which should work on any blittable type, actually contains a hard-coded check for primitive types and refuses to work on any structure.

So just go with T[] (plus int Count -- array length isn't good enough because trying to keep capacity equal to count is very inefficient) and use System.Array static methods when you would otherwise use methods of List<T>. If you wrap this as a PublicList<T> class, you can get reusability and both the convenience of methods for Add, Insert, Sort as well as direct element access by indexing directly on the array. Just exercise some restraint and never store the handle to the internal array, because it will become out-of-date the next time the list needs to grow its capacity. Immediate direct access is perfectly fine though.

Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
  • Thank you! You're the first of almost 20 people who I've asked who didn't start by 10 minute argument that what I want is wrong and should be doing X instead. The static `System.Array` methods are wonderful, I wish I'd known about them a few years ago hehe. – Jakub Arnold Oct 06 '16 at 22:12
  • @JakubArnold I'll kindly point out that this is exactly the content of the answer I posted above that you dismissed as "completely irrelevant". You'll also notice I posted it *before* making the point that there may be better tools for this job. – J... Oct 06 '16 at 22:16
  • @JakubArnold: I guess I need to qualify my initial statement. It would be completely possible to have a `PublicList` that works exactly like `List` and also yields a direct reference to the array, for direct access purposes. But you can't have a container indexer work the way you want. – Ben Voigt Oct 06 '16 at 22:16
  • @J...: The answer you referenced stores boxed references to structures, not the structures themselves, and has terrible performance due to loss of locality. – Ben Voigt Oct 06 '16 at 22:18
  • @BenVoigt No - the very first one. [Arrays implement generic interfaces](http://stackoverflow.com/a/15561100/327083) – J... Oct 06 '16 at 22:20
  • @J...: The only mention I can find in there of operations that change collection size, like `Insert` or `Add` or `Remove`, is that the explicit interface implementations **will throw `NotSupportedException`**. Nothing in there talks about how to actually get there with arrays. – Ben Voigt Oct 06 '16 at 22:22
  • @BenVoigt It goes on to discuss the `Array` class and non-generic methods as well. – J... Oct 06 '16 at 22:33
  • @J...: But it doesn't discuss the methods that are actually useful here, such as `Resize` (which IS generic). – Ben Voigt Oct 06 '16 at 22:39