Allocation-free List-like small data struct for .NET 5+

Question

Are there any data structures for .NET (the recent versions) that have an initial "small" mode that doesn't allocate memory? Something that would use initially memory on the stack (stackalloc), then if that is all used up switch to using the heap and only then put pressure on GC. A SmallList<T> with a a similar interface to List<T> would be the most useful when you know in most cases only a few items are added.

In C++ there is at least one good implementation of this idea used in LLVM, the SmallVector and some similar small types for sets/maps. What is the difference between std::vector and llvm::SmallVector? which one to use when?
Also this video: CppCon 2016: Chandler Carruth “High Performance Code 201: Hybrid Data Structures"

I've been googling a bit and couldn't find anything about an existing implementation, and if it's even possible to do. It would have to be limited to value types more than certainly, but even then not sure how you would write some T value into the stackalloc array and read it back - for ex. if T is an int, is there now an API to "reinterpret_cast" a range of bytes from a span to an int directly, not by combining the 4 bytes one by one?

The closest data-struct with this idea I could find is the SStringBuilder from the Towel lib: https://github.com/ZacharyPatten/Towel/blob/main/Sources/Towel/SStringBuilder.cs

// SStringBuilder is a small helper for initializing strings.
// It will append to the span until the capacity is reached
// and then it will revert to a StringBuilder if necessary

Edit
Found this API, may be what's needed https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.memorymarshal.read

Edit 2

Made a hacky SmallList that just works with the small number of values, throwing an exception if capacity is reached instead of switching to a List. Tried a few ways of doing this:

with Span and using MemoryMarshal.Read/Write
with Span and read/write directly the span
with ArrayPool.Shared.Rent/Return
with a List
with a List with capacity preallocated in constructor

Up to close to 32 values, the Span versions are quite a lot faster, below are some results from BenchmarkDotNet - bench just adds N small structs (2 int fields) to the list and then gets each value and adds up the fields. N was 1,2,4... and a few more up to 128, was curious at what point the advantage disappears, in practice the stack part should be for < 10 values.

Results for 2 values:
|         SpanListBench |      2 |   6.126 ns | 0.0188 ns | 0.0147 ns |
|   TypedSpanlListBench |      2 |   5.403 ns | 0.0147 ns | 0.0123 ns |
|         PoolListBench |      2 |  24.182 ns | 0.0375 ns | 0.0351 ns |
|             ListBench |      2 |  18.160 ns | 0.1737 ns | 0.1540 ns |
|     ListBenchPrealloc |      2 |  13.754 ns | 0.0978 ns | 0.0867 ns |

Results for 8 values:
|        SpanListBench  |      8 |  18.045 ns | 0.0494 ns | 0.0438 ns |
|    TypedSpanListBench |      8 |  16.467 ns | 0.0565 ns | 0.0472 ns |
|         PoolListBench |      8 |  31.558 ns | 0.0735 ns | 0.0651 ns |
|             ListBench |      8 |  44.527 ns | 0.1977 ns | 0.1849 ns |
|     ListBenchPrealloc |      8 |  29.989 ns | 0.2079 ns | 0.1736 ns |

 ref struct TypedSpanList<T> where T:struct
    {
        private Span<T> buffer_;
        private int count_;

        public TypedSpanList(Span<T> buffer)
        {
            buffer_ = buffer;
            count_ = 0;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public void Add(T value)
        {
            if (count_ >= buffer_.Length)
            {
                throw new InvalidOperationException();
            }
            buffer_[count_] = value;
            count_++;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public T Get(int index)
        {
            if (index >= buffer_.Length)
            {
                throw new InvalidOperationException();
            }
            return buffer_[index];
        }
    }

        [Benchmark]
        public  int TypedListBench()
        {
            int count = 0;
            Span<Test> s = stackalloc Test[Number];
            var sl = new TypedSpanList<Test>(s);

            for (int i = 0; i < Number; i++)
            {
                sl.Add(new Test() { a = i, b = i + 1 });
            }

            for (int i = 0; i < Number; i++)
            {
                var t = sl.Get(i);
                count += t.a + t.b;
            }

            return count;
        }

Structs and primitive types. Small depends on the context, but it's usually 1-2, maybe 4 values. Something that would fit without issues on the stack. — Gratian Lup, Mar 14 '22 at 04:55
By the *"List-like small data struct"* you mean that the data structure should be implemented as a `struct`, and not as a `class`, correct? — Theodor Zoulias, Mar 14 '22 at 05:13
I think it would have to be a `struct` otherwise you negate the advantage of not allocating on the heap in the "small size" case. — Slugart, Mar 15 '22 at 03:36
Is there any reason for not using Array of T ? i think that will solve your problem. — Nihat Mert, Mar 17 '22 at 11:18
ref structs can not implement interfaces. And lacking IEnumerable makes it a crippled list. — Corniel Nobel, Mar 21 '22 at 08:34
I think you're overestimating the cost of the GC's Gen 0 heap. It's **insanely cheap**. I can assure you: you won't benefit from trying to abuse `stackalloc` and `ref`-structs into having GC-free collections. And if performance and/or memory _really matter_ that much to you, why are you using .NET in the first place? — Dai, Mar 22 '22 at 15:09
no out-of-the-box solution, and your exploratory implementation is a good intuition on how to implement allocation free frugal object. I recommend you this video (https://www.youtube.com/watch?v=3r6gbZFRDHs&list=WL&index=17) if you want to have more insights about this topic. — glihm, Mar 22 '22 at 19:21

score 0 · Answer 1 · answered Mar 21 '22 at 09:14

There is - as far as I know - no out-of-the-box solution by .NET.

The most important question is: what do need? List has an internal versioning mechanism that checks if you did updated while iterating. Do you need that for your small version? Iterating is challenging topic on its own: Check out .NET's go on List.

If you use a struct (by ref does not allow you to implement any interface becides IDisposable) some of you no-allocation gain might be lost when you talk via the interface (due to boxing).

So, what is most important? Mutating it's elements? passing the elements along? Iterate through all/some of the elements?

It's stating the obvious that it might be premature optimization, so I assume that you already took that into consideration.

And as far as I know, aggressive inlining does not work if you throw exceptions in those methods.

You can implement `IEnumerable` and `IEnumerator` using mutable structs which won't be boxed, you just need to be careful to ensure your types implement the "foreachable"-pattern as that's what the C# compiler will invoke _instead_ of `IEnumerator.GetEnumerator()`: https://stackoverflow.com/questions/3168311/why-do-bcl-collections-use-struct-enumerators-not-classes — Dai, Mar 22 '22 at 15:07
mutable structs, not ref structs. as in this example. And *IF* you pass your struct via the interface (ICollection, IENumerable), the compiler will not know that. Of course, If you don't communiate via the interface, that does not have to be a problem. — Corniel Nobel, Mar 23 '22 at 19:41

Allocation-free List-like small data struct for .NET 5+

1 Answers1