Why can't stackalloc be used with reference types?

Question

If stackalloc is used with reference types as below

var arr = stackalloc string[100];

there is an error

Cannot take the address of, get the size of, or declare a pointer to a managed type ('string')

Why is that so? Why can't the CLR declare a pointer to a managed type?

score 11 · Answer 1 · answered Feb 24 '16 at 10:37

The Just-In-Time compiler in .NET performs two important duties when converting MSIL as generated by the C# compiler to executable machine code. The obvious and visible one is generating the machine code. The un-obvious and completely invisible job is generating a table that tells the garbage collector where to look for object references when a GC occurs while the method is executing.

This is necessary because object roots aren't stored only in the GC heap, as a field of a class; they can also be stored in local variables or CPU registers. To do this job properly, the jitter needs to know the exact structure of the stack frame and the types of the variables stored there so it can create that table. Later, the garbage collector can then figure out how to read the proper stack frame offset or CPU register to obtain the object root value: a pointer into the GC heap.

That is a problem when you use stackalloc. That syntax takes advantage of a CLR feature that allows a program to declare a custom value type. A back-door around normal managed type declarations, with the restriction that this value type cannot contain any fields. Just a blob of memory, it is up to the program to generate the proper offsets into that blob. The C# compiler helps you generate those offsets, based on the type declaration and the index expression.

Also very common in a C++/CLI program, that same custom value type feature can provide the storage for a native C++ object. Only space for the storage of that object is required, how to properly initialize it and access the members of that C++ object is a job that the C++ compiler figures out. Nothing that the GC needs to know about.

So the core restriction is that there is no way to provide type info for this blob of memory. As far as the CLR is concerned these are just plain bytes with no structure, the table that the GC uses has no option to describe its internal structure.

Inevitably, the only kind of type you can use is the kind that does not require an object reference that the GC needs to know about. Blittable value types or pointers. So System.String is a no-go, it is a reference type. The closest you could possibly get that is "stringy" is:

  char** mem = stackalloc char*[100];

With the further restriction that it is entirely up to you to ensure that the char* elements point to either a pinned or unmanaged string. And that you don't index the "array" out of bounds. This is not very practical.
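A minimal sketch of that "stringy" workaround, assuming the code is compiled with unsafe blocks enabled and that both arguments are non-null. The class and method names are hypothetical and exist only for illustration; the strings are pinned with `fixed` for as long as the pointers are used.

```csharp
using System;

static class StackallocPointerDemo
{
    // Hypothetical helper: returns the first character of each string,
    // read through a stack-allocated table of char* pointers.
    // Assumes both strings are non-null.
    public static unsafe string FirstChars(string a, string b)
    {
        // The table holds only raw pointers, so the GC has nothing to track in it.
        char** mem = stackalloc char*[2];

        // 'fixed' pins both strings so their addresses stay valid inside this block.
        fixed (char* pa = a, pb = b)
        {
            mem[0] = pa;
            mem[1] = pb;

            // Index each pointer like an array; staying in bounds is the caller's job.
            return new string(new[] { mem[0][0], mem[1][0] });
        }
    }
}
```

Once the `fixed` block ends, the entries in `mem` must not be used any more, because the strings are free to move again.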

I always thought of `stackalloc` as a syntactic sugar for a bunch of unnamed variables on stack. For example `stackalloc string[3]` could easily be represented as `string s1; string s2; string s3;` by the compiler (there will be some alignment and unique naming shenanigans). So, [xanatos's answer](https://stackoverflow.com/a/35598183/311618) made more sense for me. — zahir, Jan 19 '23 at 17:06

xanatos · Accepted Answer · 2016-02-24T09:45:31.570

9

The "problem" is bigger: in C# you can't have a pointer to a managed type. If you try writing (in C#):

string *pstr;

you'll get:

Cannot take the address of, get the size of, or declare a pointer to a managed type ('string')

Now, stackalloc T[num] returns a T* (see for example here), so clearly stackalloc can't be used with reference types.
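For contrast, here is a small sketch of the case that does compile: an unmanaged element type, where the result of stackalloc is a plain `int*`. The class and method names are made up for illustration, `count` is assumed to be small and non-negative, and the code needs unsafe blocks enabled.

```csharp
static class StackallocUnmanagedDemo
{
    // Hypothetical helper: fills a stack buffer with squares and sums them.
    public static unsafe int SumOfSquares(int count)
    {
        // int is an unmanaged type, so the result is an int* into the stack frame.
        int* buffer = stackalloc int[count];

        // The buffer is written before it is read; its initial contents are not relied on.
        for (int i = 0; i < count; i++)
            buffer[i] = i * i;

        int sum = 0;
        for (int i = 0; i < count; i++)
            sum += buffer[i];

        return sum;
    }
}
```

Replacing `int` with `string` in the stackalloc expression reproduces the error quoted above.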

The reason why you can't have a pointer to a reference type is probably connected to the fact that the GC can move reference types around memory freely (to compact the memory), so the validity of a pointer could be short.

Note that in C++/CLI it is possible to pin a reference type and take its address (see pin_ptr)

edited Feb 24 '16 at 09:45

answered Feb 24 '16 at 09:29

xanatos


You CAN have a pointer to a managed type in C#, it's just not built into the language as it is with C++/CLI. https://msdn.microsoft.com/en-us/library/1246yz8f(v=vs.110).aspx - Use GCHandleType.Pinned as the second argument, then call AddrOfPinnedObject() on the result. – marknuzz May 31 '16 at 22:21
@Nuzzolilo Have you tried it? If I remember correctly you'll get an exception if you try to `GCHandleType.Pinned` a managed object. – xanatos Jun 01 '16 at 04:46
I recall having done this years ago, but it's been so long that my memory could be wrong :P. I'll try it again and see what happens. – marknuzz Jun 01 '16 at 04:50
Alright I've run a couple of quick tests. You will get an exception if you use a non-primitive type with this. Arrays, and even strings can be pinned, however. I'm not sure the exact criteria beyond this, perhaps objects are okay as long as there are no unpinned references to other objects. – marknuzz Jun 01 '16 at 04:54
@Nuzzolilo Because arrays and strings are *specially* handled... You receive a pointer to the first element of the array/string. It was probably done for interop support. – xanatos Jun 01 '16 at 06:39
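
The pinning approach discussed in these comments can be sketched roughly as follows, using `GCHandle.Alloc` with `GCHandleType.Pinned` on an `int[]`. The class and method names are hypothetical, and the array is assumed to be non-null with at least one element.

```csharp
using System;
using System.Runtime.InteropServices;

static class PinningDemo
{
    // Hypothetical helper: pins an int[] and reads its first element through the raw address.
    // Assumes 'data' is non-null and has at least one element.
    public static int ReadFirstElementViaPinnedHandle(int[] data)
    {
        // Pinning stops the GC from moving the array while the handle is alive.
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            // For an array, this is the address of the first element.
            IntPtr address = handle.AddrOfPinnedObject();
            return Marshal.ReadInt32(address);
        }
        finally
        {
            // Always release the handle, otherwise the array stays pinned.
            handle.Free();
        }
    }
}
```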

David Haim · Answer 3 · 2016-02-24T10:21:24.523

Because C# relies on garbage collection for memory safety, as opposed to C++, where you are expected to know the nuances of memory management.

For example, take a look at the following code:

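// Hypothetical: this does not compile, because stackalloc rejects string.
public static unsafe void doAsync(){
    var arr = stackalloc string[100];
    arr[0] = "hi";
    // The callback runs long after doAsync has returned.
    System.Threading.ThreadPool.QueueUserWorkItem(_ => {
        Thread.Sleep(10000);
        Console.Write(arr[0]);
    });
}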

The program will easily crash. Because arr is stack-allocated, the object and its memory disappear as soon as doAsync returns. The lambda still points to this no-longer-valid memory address, which is an invalid state.

If you pass local primitives by reference, the same problem will occur.

The scheme is:
static objects -> live for the whole lifetime of the application
local objects -> live as long as the scope that created them is valid
heap-allocated objects (created with new) -> exist as long as someone holds a reference to them.

Another problem is that garbage collection runs periodically. When an object is local, it should be finalized as soon as the function is over, because after that time the memory will be overwritten by other variables. The GC can't be forced to finalize the object, or shouldn't be, anyway.

The good thing, though, is that the C# JIT can sometimes (not always) determine that an object can safely be allocated on the stack, and will resort to stack allocation if it's possible (again, sometimes).

In C++, on the other hand, you can declare everything anywhere. This comes with less safety than C# or Java, but you can fine-tune your application and achieve a high-performance, low-resource application.

*"so no memory exception can happen with primitives."* This is not true. If you were to use ints instead, you are still accessing freed memory if you attempt to access it after the array has been freed from the stack. — Matthew Watson, Feb 24 '16 at 09:47
Arrays in C# are full-fledged objects, so there is no contradiction here -> array of primitives -> object which holds primitives -> same problem. When I say "primitives" I mean variables like `int`, `bool` etc., not compound types like arrays, which are objects — David Haim, Feb 24 '16 at 09:49

Matthew Watson · Answer 4 · 2016-02-24T10:13:17.997

-2

I think Xanatos posted the correct answer.

Anyway, this isn't an answer, but instead a counterexample to another answer.

Consider the following code:

using System;
using System.Threading;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            doAsync();
            Thread.Sleep(2000);
            Console.WriteLine("Did we finish?"); // Likely this is never displayed.
        }

        public static unsafe void doAsync()
        {
            int n = 10000;
            int* arr = stackalloc int[n];
            ThreadPool.QueueUserWorkItem(x => {
                Thread.Sleep(1000);

                for (int i = 0; i < n; ++i)
                    arr[i] = 0;
            });
        }
    }
}

If you run that code, it will crash because the stack array is being written to after the stack memory for it has been freed.

This shows that the reason that stackalloc cannot be used with reference types isn't simply to prevent this kind of error.

edited Feb 24 '16 at 10:13

answered Feb 24 '16 at 09:53

Matthew Watson


lol. but `int[]` is an object (reference type), so you didn't prove anything. – David Haim Feb 24 '16 at 09:59
@DavidHaim There is no `int[]` in this code. `arr` is an `int*`. – Matthew Watson Feb 24 '16 at 10:02
`arr` is the decayed version of `int[N]` which is `int*`, but `int[N]` is an object. there is no contradiction here. `arr->ToString()` proves that `int[N]` is an object and not a value type. if it was a value type it wouldn't have any methods. – David Haim Feb 24 '16 at 10:04
@DavidHaim There is no `int[3]` anywhere in the code I posted. I'm afraid I don't know what you mean. – Matthew Watson Feb 24 '16 at 10:08
Furthermore, if you try to write `Console.WriteLine(arr.GetType());` you will get a compile error: `Error CS1061 'int*' does not contain a definition for 'GetType' and no extension method 'GetType' accepting a first argument of type 'int*' could be found ` – Matthew Watson Feb 24 '16 at 10:10
I've changed the declaration of `arr` to make it clearer that it is a pointer type. – Matthew Watson Feb 24 '16 at 10:13
`"arr->ToString() proves that int[N] is an object and not a value type. if it was a value type it wouldn't have any methods."`. That's not true. `arr->ToString()` is the same as saying `(*arr).ToString()` which (since arr is an `int*`) will yield an int and call `ToString()` on the int. – Matthew Watson Feb 24 '16 at 10:17
OK, my point was that primitives like `int`, `bool` etc. are passed by value. I will change it to be something like `primitives that are passed by reference can also suffer from this problem`. I admit that I thought that arrays are always objects. – David Haim Feb 24 '16 at 10:20
@DavidHaim Arrays ARE always objects, but when you use stackalloc, you are not creating an array, you are reserving some bytes on the stack and assigning the address of the start of those bytes to a pointer type. – Matthew Watson Feb 24 '16 at 10:33
Right, the syntax looks similar to creating an array, but the compiler in this context is treating the expression very differently when using stackalloc. – marknuzz Jun 01 '16 at 04:57
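
The point made in this thread about `arr->ToString()` can be checked with a short sketch. The class and method names are hypothetical, and the code assumes unsafe blocks are enabled.

```csharp
static class PointerMemberAccessDemo
{
    // Hypothetical helper: shows that arr->ToString() is just (*arr).ToString(),
    // i.e. a call on the int that the pointer refers to, not on an array object.
    public static unsafe string FirstElementText()
    {
        // Reserves room for three ints on the stack; no array object is created.
        int* arr = stackalloc int[3];
        arr[0] = 42;
        arr[1] = 0;
        arr[2] = 0;

        string viaArrow = arr->ToString();   // member access through the pointer
        string viaDeref = (*arr).ToString(); // explicit dereference, same call

        // Both forms call Int32.ToString() on the first element.
        return viaArrow == viaDeref ? viaArrow : null;
    }
}
```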
