6

Question:

Do all CLR value types, including user-defined structs, live on the evaluation stack exclusively, meaning that they will never need to be reclaimed by the garbage collector, or are there cases where they are garbage-collected?

Background:

I have previously asked a question on SO about the impact that a fluent interface has on the runtime performance of a .NET application. I was particularly worried that creating a large number of very short-lived temporary objects would negatively affect runtime performance through more frequent garbage collection.

Now it has occurred to me that if I declared those temporary objects' types as struct (i.e. as user-defined value types) instead of class, the garbage collector might not be involved at all if it turns out that all value types live exclusively on the evaluation stack.

(This occurred to me mainly because I was thinking of C++'s way of handling local variables. Usually being automatic (auto) variables, they are allocated on the stack and therefore freed when the program execution gets back to the caller; no dynamic memory management via new/delete is involved at all. I thought the CLR just might handle structs similarly.)

What I've found out so far:

I did a brief experiment to see what the differences are in the CIL generated for user-defined value types and reference types. This is my C# code:

struct SomeValueType     {  public int X;  }
class SomeReferenceType  {  public int X;  }
.
.
static void TryValueType(SomeValueType vt) { ... }
static void TryReferenceType(SomeReferenceType rt) { ... }
.
.
var vt = new SomeValueType { X = 1 };
var rt = new SomeReferenceType { X = 2 };
TryValueType(vt);
TryReferenceType(rt);

And this is the CIL generated for the last four lines of code:

.locals init
(
    [0] valuetype SomeValueType vt,
    [1] class SomeReferenceType rt,
    [2] valuetype SomeValueType <>g__initLocal0,  //
    [3] class SomeReferenceType <>g__initLocal1,  // why are these generated?
    [4] valuetype SomeValueType CS$0$0000         //
)

L_0000: ldloca.s CS$0$0000
L_0002: initobj SomeValueType  // no newobj required, instance already allocated
L_0008: ldloc.s CS$0$0000
L_000a: stloc.2
L_000b: ldloca.s <>g__initLocal0
L_000d: ldc.i4.1 
L_000e: stfld int32 SomeValueType::X
L_0013: ldloc.2 
L_0014: stloc.0 
L_0015: newobj instance void SomeReferenceType::.ctor()
L_001a: stloc.3
L_001b: ldloc.3 
L_001c: ldc.i4.2 
L_001d: stfld int32 SomeReferenceType::X
L_0022: ldloc.3 
L_0023: stloc.1 
L_0024: ldloc.0 
L_0025: call void Program::TryValueType(valuetype SomeValueType)
L_002a: ldloc.1 
L_002b: call void Program::TryReferenceType(class SomeReferenceType)

What I cannot figure out from this code is this:

  • Where are all those local variables mentioned in the .locals block allocated? How are they allocated? How are they freed?

  • (Off-topic: Why are so many anonymous local variables needed and copied to-and-fro, only to initialize my two local variables rt and vt?)

stakx - no longer contributing
  • Maybe I should add that I'm not interested so much in the specific way that Microsoft's CLR handles this, but how `struct`s are allocated according to the (general) CLI specification. – stakx - no longer contributing May 15 '10 at 13:41
  • The CLI specification *doesn't* specify this. Whether objects are allocated on the stack or heap is an implementation detail. – Aaronaught May 15 '10 at 13:55
  • FWIW, your accepted answer is full of misinformation. Several people explained this in comments but a moderator has deleted all of the comments. – J D May 21 '12 at 08:39
  • @Jon, thanks for the hint. I cannot remember, or perhaps was never notified about, these deleted comments. If there's any imprecision or misinformation in Aaronaught's answer, I would quite welcome corrections (comments *or* direct edits)! – stakx - no longer contributing May 21 '12 at 09:09
  • @Aaronaught: Exactly *where* things are stored is an implementation detail, but every struct-type storage location must refer to a semantically-distinct instance of that type; in a sequence like `struct1=struct2; struct1.f1=v1; struct2.f2=v2;`, the write to `struct1.f1` must not affect `struct2.f1`, and the write to `struct2.f2` must not affect `struct1.f2`. The most practical way to ensure such semantics is to have struct assignment copy all public and private fields (implementations could in theory use copy-on-write, etc. but copying fields is almost certainly easier and faster). – supercat May 21 '12 at 15:07
  • @supercat: What you're saying is correct and is actually a major part of the caution in my answer. Structs are *supposed* to guarantee exactly the behaviour you specify, but if you allow those structs to contain mutable reference types, you violate that assumption, such that it is entirely possible for `struct1.f1.p1=v1` to change the value of `struct2.f1.p1` (if `f1` is a ref type). If your reference types are immutable then maybe you can relax your guard, but IME it is *so* easy for a once-immutable type to become mutable just to fix one little bug or optimize one little method... – Aaronaught May 21 '12 at 17:35
  • @Aaronaught: Structs should generally only contain mutable reference types in cases where it is clear that what the struct holds is the *identity* of the class object, rather than its content (e.g. one could have a `Dictionary` to keep track of which forms are associated with which file names; the filename string and the *identity* of the form in the associated `KeyValuePair` structure would be immutable, even if the form was moved around and the document within it edited). – supercat May 21 '12 at 17:44
  • @supercat: I'm not sure what it is about your example that you think makes this identity assumption clear, and it doesn't really jibe with any guidance I've ever seen or read. .NET generics are a special case because they are specifically optimized for use with value types (i.e. they don't need to box, compared to the old `ArrayList` and so on) - nor is `KeyValuePair` specifically intended to be used with reference types, it can simply be used with *any* types. A generic type without a `struct` constraint is a very different beast from a value type with explicit ref-type fields. – Aaronaught May 21 '12 at 17:54
  • I'm sorry to say, but it seems as though a lot of people contributing here are only trying to rationalize their choices rather than really try to understand the difference between reference vs. value types and what factors could/should govern the choice of one over the other. Yes, there will be valid reasons, under certain circumstances, to use a mutable `struct` and/or one with reference-typed fields; however, that decision should be supported by reasons relevant to the problem domain, not straw-man arguments intended to poke holes in its classification as something to be avoided. – Aaronaught May 21 '12 at 17:58
  • @Aaronaught: Objects of type `Form` (or derivatives) are generally expected to change in their lifetime and have an existence outside any variables or fields that hold references to them. A field of type `Form` would be expected to *identify* a form, rather than "hold" one; that would be true if a field of such type were included in a non-generic structure. If `Foo.f` is a field of type `Form`, one should expect that `Foo.f.Width` could change arbitrarily, even if `Foo` is stored in an immutable location. – supercat May 21 '12 at 17:59
  • @supercat: In essence you seem to be saying that *any* reference type is automatically OK to use in a `struct` or as part of an immutable object because its name somehow indicates that it's an identity (to whom? on what basis?). Maybe this would have made sense when referring to pointer types in unmanaged languages which have their own explicit syntax, but it's almost never going to be the case in .NET. Can you actually give an example of any object tree which *wouldn't* be valid under your proposed rule? – Aaronaught May 21 '12 at 18:02
  • @Aaronaught: For a struct field to hold something like an array or `List` would be dubious, unless the instance is created by the struct, and will never be mutated nor exposed to the outside world, since someone might expect that copying a struct would copy the data within it. Struct with mutable-class-type fields should generally only expose or mutate the objects referred to by those fields in cases where one would expect that copying a field won't duplicate the object it refers to (as would be the case with, e.g., `Form`). – supercat May 21 '12 at 18:29
  • @supercat There is nothing "dubious" about having a struct with an array in it. For example, you can use this to augment nullable arrays with a structural pretty printer without incurring two levels of indirection. – J D May 21 '12 at 19:08
  • @JonHarrop: Would the struct be exposing the array directly, would it expose a read-only indexer on an array to which it holds the only reference, would it expose a read-only indexer to an array received from elsewhere, or would it expose a read-write indexer? The second of those choices I would have no problem with, but I would likely consider the last of those choices a gross violation of the Principle of Least Astonishment. The other two I would consider generally dubious, though there may be some cases when they would be appropriate. – supercat May 21 '12 at 21:07
  • @supercat The value type wrapper could expose anything, even the array in all of its mutable glory. The reason is simply that the value type is not wrapping the array but, rather, the reference to the array. So, for example, the value type's methods can handle null in useful and interesting ways. For example, F# often pretty prints its `None` value of the `option` type as `null` because that is its internal representation. Had they used a value type wrapper it could always pretty print correctly even when the reference inside the value type is `null`. – J D May 21 '12 at 22:22
  • @JonHarrop: Such a usage sounds like my first choice above, and sounds like a case where it may be appropriate. The biggest problem with it is that there's nothing inherent in such a struct itself which would indicate whether someone who wants to change the value of `MyStruct.Arr[5]` should be expected to replace `Arr` with a clone of the array and modify element 5 of that clone, or should be expected to modify element 5 of the existing array; likewise nothing indicates whether someone who wants to copy the data in `MyStruct` should be expected to make a deep or shallow copy. – supercat May 21 '12 at 22:54
  • @JonHarrop: It's too bad that .net doesn't have more compiler-supported array types (e.g. an abstract `ReadableArray`, with derivatives `Array`, `ImmutableArray`, and `ReadonlyArrayReference`). As it is, if one tries to use a read-only wrapper with an array of structs, the compiler will have to make a superfluous copy of an entire struct in order to read an element thereof; a compiler-supported `ReadableArray` wouldn't have to have that problem. – supercat May 21 '12 at 23:01
  • @supercat "the compiler will have to make a superfluous copy of an entire struct in order to read an element thereof". No, the compiler can optimize that away. My garbage collected virtual machine with value types does. http://www.ffconsultancy.com/ocaml/hlvm/ – J D May 21 '12 at 23:26
  • @JonHarrop: One could certainly design a framework to avoid superfluous (and in many cases semantically-wrong) copies of value types, but I don't think one can have a read-only array wrapper *in .Net* without them. I've had some thoughts about what I'd like to see in a framework, if you'd like to discuss them in chat. – supercat May 22 '12 at 14:22

4 Answers

11

Your accepted answer is wrong.

> The difference between value types and reference types is primarily one of assignment semantics. Value types are copied on assignment - for a struct, that means copying the contents of all fields. Reference types only copy the reference, not the data. The stack is an implementation detail. The CLI spec promises nothing about where an object is allocated, and it's a bad idea to depend on behaviour that isn't in the spec.

Value types are characterised by their pass-by-value semantics but that does not mean they actually get copied by the generated machine code.

For example, a function that squares a complex number can accept the real and imaginary components in two floating point registers and return its result in two floating point registers. The code generator optimizes away all of the copying.
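
A minimal C# sketch of the assignment semantics discussed above, reusing the `SomeValueType` and `SomeReferenceType` declarations from the question. The `CopySemanticsDemo` class and its method names are invented for illustration. It shows only the semantics observable from C#; whether the JIT physically copies anything in the generated machine code cannot be seen at this level.

```csharp
using System;

struct SomeValueType { public int X; }
class SomeReferenceType { public int X; }

// Hypothetical helper class, not part of the original question.
static class CopySemanticsDemo
{
    // Assigning a struct yields an independent value:
    // changing b leaves a untouched.
    public static (int, int) StructAssignment()
    {
        var a = new SomeValueType { X = 1 };
        var b = a;          // semantically a copy of all fields
        b.X = 2;
        return (a.X, b.X);  // (1, 2)
    }

    // Assigning a class instance copies only the reference:
    // a and b refer to the same object.
    public static (int, int) ClassAssignment()
    {
        var a = new SomeReferenceType { X = 1 };
        var b = a;          // same object, second reference
        b.X = 2;
        return (a.X, b.X);  // (2, 2)
    }
}
```

Nothing in this sketch depends on where the values are stored, which is consistent with the point that storage location is an implementation detail.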

Several people had explained why this answer was wrong in comments below it but some moderator has deleted all of them.

> Temporary objects (locals) will live in the GC generation 0. The GC is already smart enough to free them as soon as they go out of scope. You do not need to switch to struct instances for this.

This is complete nonsense. The GC sees only the information available at run-time, by which point all notions of scope have disappeared. The GC will not collect anything "as soon as it goes out of scope". The GC will collect it at some point after it has become unreachable.

> Mutable value types already have a tendency to lead to bugs because it's hard to understand when you're mutating a copy vs. the original. But introducing reference properties on those value types, as would be the case with a fluent interface, is going to be a mess, because it will appear that some parts of the struct are getting copied but others aren't (i.e. nested properties of reference properties). I can't recommend against this practice strongly enough, it's liable to lead to all kinds of maintenance headaches in the long haul.

Again, this is complete nonsense. There is nothing wrong with having references inside a value type.

Now, to answer your question:

> Do all CLR value types, including user-defined structs, live on the evaluation stack exclusively, meaning that they will never need to be reclaimed by the garbage-collector, or are there cases where they are garbage-collected?

Value types certainly do not "live on the evaluation stack exclusively". The preference is to store them in registers. If necessary, they will be spilled to the stack. Sometimes they are even boxed on the heap.

For example, if you write a function that loops over the elements of an array then there is a good chance that the int loop variable (a value type) will live entirely in a register and never be spilled to the stack or written into the heap. This is what Eric Lippert (of the Microsoft C# team, who wrote of himself "I don’t know all the details" regarding .NET's GC) meant when he wrote that value types can be spilled to the stack when "the jitter chooses to not enregister the value". This is also true of larger value types (like System.Numerics.Complex) but there is a higher chance of larger value types not fitting in registers.

Another important example where value types do not live on the stack is when you're using an array with elements of a value type. In particular, the .NET Dictionary collection uses an array of structs in order to store the key, value and hash for each entry contiguously in memory. This dramatically improves memory locality, cache efficiency and, consequently, performance. Value types (and reified generics) are the reason why .NET is 17× faster than Java on this hash table benchmark.
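
A small sketch of the array case, under the assumption (true of current Microsoft runtimes, but not promised by the CLI spec) that struct elements are stored inline in the array object. The `Entry` and `EntryObject` types are hypothetical stand-ins invented here; they are not the actual internal types used by `Dictionary`.

```csharp
using System;

// Hypothetical entry types for illustration only.
struct Entry { public int HashCode; public string Key; public int Value; }
class EntryObject { public int HashCode; public string Key; public int Value; }

static class ArrayLayoutDemo
{
    // An array of structs is a single heap object. Its elements exist
    // (zero-initialized) as soon as the array is created, and an element
    // access writes straight into the array's own storage.
    public static int StructArray()
    {
        var entries = new Entry[4];
        entries[2].Value = 42;      // no per-element allocation needed
        return entries[2].Value;    // 42
    }

    // An array of class instances holds only references, all null until
    // each element is allocated separately.
    public static bool ClassArrayStartsNull()
    {
        var entries = new EntryObject[4];
        return entries[2] == null;  // true
    }
}
```

In the struct case the values live on the GC heap as part of the array, so they are reclaimed together with it rather than individually.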

I did a brief experiment to see what the differences are in the CIL generated...

CIL is a high-level intermediate language and, consequently, will not give you any information about register allocation and spilling to the stack and does not even give you an accurate picture of boxing. Looking at CIL can, however, let you see how the front-end C# or F# compiler boxes some value types as it translates even higher-level constructs like async and comprehensions into CIL.

For more information on garbage collection I highly recommend The Garbage Collection Handbook and The Memory Management Reference. If you want a deep dive into the internal implementation of value types in VMs then I recommend reading the source code of my own HLVM project. In HLVM, tuples are value types and you can see the assembler generated and how it uses LLVM to keep the fields of value types in registers whenever possible and optimizes away unnecessary copying, spilling to the stack only when necessary.

J D
  • Do you have evidence for any of your claims, or anything of substance to refute the many links to Eric Lippert and other experts in that answer? And can you explain what the relevance of your second paragraph is to the question at hand? How your hypothetical function might be optimized would be entirely up to the jitter, so this sounds like speculation to me, and in any event, a "complex number" struct is not going to have any reference types or mutable properties attached to it - it's an atomic value, which is the whole point. – Aaronaught May 21 '12 at 19:47
  • You are also preposterously oversimplifying the GC. The GC is generational; objects in gen 0 will be collected much sooner than anything else, and temporary heap objects will invariably end up in gen 0 because of their scope. This sounds like the FUD I used to hear 5 years ago from people insisting that GC could never outperform explicit MM. Your entire argument is predicated on knowing the implementation details, so - while it's true that the GC's spec does not guarantee collection of gen0 objects as soon as they leave scope - practically speaking, that's exactly what happens. – Aaronaught May 21 '12 at 19:51
  • "it's true that the GC's spec does not guarantee collection of gen0 objects as soon as they leave scope - practically speaking, that's exactly what happens". As I already explained, the GC is not even aware of the existence of scope in the source language so your claim is obviously non-sensical. – J D Oct 20 '12 at 14:23
  • When GC kicks in, and scans the stack for alive objects it does know which local pointers contain live data and which pointers contain out of the scope data. So GC is scope aware. ..and by local pointers I mean local variables and registers – Panos Theof Sep 10 '13 at 14:32
  • @PanosTheof: "So GC is scope aware". That is incorrect. The GC is *not* scope aware. – J D Sep 10 '13 at 20:05
5

Please consider the following:

  1. The difference between value types and reference types is primarily one of assignment semantics. Value types are copied on assignment - for a struct, that means copying the contents of all fields. Reference types only copy the reference, not the data. The stack is an implementation detail. The CLI spec promises nothing about where an object is allocated, and it's ordinarily a dangerous idea to depend on behaviour that isn't in the spec.

  2. Temporary objects (locals) will live in the GC generation 0. The GC is already smart enough to free them (almost) as soon as they go out of scope - or whenever it is actually most efficient to do so. Gen0 runs frequently enough that you do not need to switch to struct instances for efficiently managing temporary objects.

  3. Mutable value types already have a tendency to lead to bugs because it's hard to understand when you're mutating a copy vs. the original. Many of the language designers themselves recommend making value types immutable whenever possible for exactly this reason, and the guidance is echoed by many of the top contributors on this site.

Introducing *reference properties* on those value types, as would be the case with a fluent interface, further violates the Principle of Least Surprise by creating inconsistent semantics. The expectation for value types is that they are copied, *in their entirety*, on assignment, but when reference types are included among their properties, you will actually only be getting a shallow copy. In the worst case you have a mutable struct containing *mutable* reference types, and the consumer of such an object will likely erroneously assume that one instance can be mutated without affecting the other.

There are always exceptions - some of them in the framework itself - but as a general rule of thumb, I would not recommend writing "optimized" code that (a) depends on private implementation details and (b) that you know will be difficult to maintain, *unless* you (a) have full control over the execution environment and (b) have actually profiled your code and verified that the optimization would make a significant difference in latency or throughput.

  4. The <>g__initLocal0 and related locals are there because you are using object initializers. Switch to parameterized constructors and you'll see those disappear.
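
The shallow-copy behaviour described in point 3 can be sketched as follows. The `Settings` struct and the `ShallowCopyDemo` class are hypothetical names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable struct with one value-typed field and one
// reference-typed field.
struct Settings
{
    public int Retries;
    public List<string> Tags;
}

static class ShallowCopyDemo
{
    public static (int, int) Run()
    {
        var a = new Settings { Retries = 1, Tags = new List<string>() };
        var b = a;              // copies Retries and the *reference* in Tags

        b.Retries = 5;          // affects only b
        b.Tags.Add("urgent");   // visible through a as well: same List object

        return (a.Retries, a.Tags.Count);   // (1, 1)
    }
}
```

After the assignment, `a.Retries` is unaffected by the change to `b`, but both structs still share one `List<string>`, so the added tag shows up through `a` too.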

Value types are typically allocated on the stack, and reference types are typically allocated on the heap, but that is not actually part of the .NET specification and is not guaranteed (in the first linked post, Eric even points out some obvious exceptions).

More importantly, it's simply incorrect to assume that the stack being generally cheaper than the heap automatically means that any program or algorithm using stack semantics will run faster or more efficiently than a GC-managed heap. There are a number of papers written on this topic and it is entirely possible and often likely for a GC heap to outperform stack allocation with a large number of objects, because modern GC implementations are actually more sensitive to the number of objects that don't need freeing (as opposed to stack implementations which are entirely pinned to the number of objects on the stack).

In other words, if you've allocated thousands or millions of temporary objects - even if your assumption about value types having stack semantics holds true on your particular platform in your particular environment - utilizing it could still make your program slower!

Therefore I'll return to my original advice: Let the GC do its job, and don't assume that your implementation can outperform it without a full performance analysis under all possible execution conditions. If you start with clean, maintainable code, you can always optimize later; but if you write what you believe to be performance-optimized code at the cost of maintainability and later turn out to be wrong in your performance assumptions, the cost to your project will be far greater in terms of maintenance overhead, defect counts, etc.

Aaronaught
  • This answer contains at least one serious flaw: "Temporary objects (locals) will live in the GC generation 0. The GC is already smart enough to free them as soon as they go out of scope" Gen 0 objects still need to wait for a gen 0 collection, which will be frequent, but doesn't occur as soon as they go out of scope. – Robert May 21 '12 at 14:13
  • Why are valid comments being deleted? This is totally unacceptable. – hcoverlambda May 21 '12 at 14:36
  • @MikeOBrien: There may have been a few valid comments, but for the most part it had turned into an entire off-topic conversation, and that type of dialogue belongs in [chat], which the participants were refusing to take it to. Comments are for adding clarifications or in some cases for correcting minor errors; if you have *that* much more to say, submit your own answer. – Aaronaught May 21 '12 at 16:44
  • @Robert: I've corrected the part you were concerned about; however, I wouldn't have considered it a *serious* flaw, as it runs so frequently that it is practically instantaneous - and if it runs slightly *less* frequently then it will in all likelihood be *more efficient* because it can deallocate entire blocks at once instead of a tiny handful of individual objects. That is, I suppose, unless you're mixing allocations of temporary and long-lived objects in the same hot spot. – Aaronaught May 21 '12 at 17:18
4

It is a JIT compiler implementation detail where it will allocate the .locals. Right now, I don't know of any that doesn't allocate them on a stack frame. They are "allocated" by adjusting the stack pointer and "freed" by resetting it back. Very fast, hard to improve. But who knows, 20 years from now we might all be running machines with CPU cores that are optimized to run only managed code with a completely different internal implementation. Probably cores with a ton of registers; the JIT optimizer already uses registers to store locals now.

The temporaries are emitted by the C# compiler to provide some minimum consistency guarantees in case object initializers throw exceptions. It prevents your code from ever seeing a partially initialized object in a catch or finally block. Also used in the using and lock statements, it prevents the wrong object from being disposed or unlocked if you replace the object reference in your code.
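
A rough C# sketch of what the temporaries buy you for object initializers. The `Widget` type, the `Fail` helper and the `InitializerDemo` class are hypothetical names invented here; the hand-written expansion is only an approximation of what the compiler emits.

```csharp
using System;

// Hypothetical type for illustration.
class Widget { public int X; }

static class InitializerDemo
{
    static int Fail() { throw new InvalidOperationException("initializer failed"); }

    // Roughly what `var w = new Widget { X = 2 };` expands to:
    // the object is built in a temporary and assigned only once complete.
    public static int ManualExpansion()
    {
        var temp = new Widget();
        temp.X = 2;
        var w = temp;
        return w.X;     // 2
    }

    // If an initializer expression throws, the target variable is never
    // assigned, so the catch block cannot observe a half-built object.
    public static bool TargetUntouchedOnThrow()
    {
        Widget w = null;
        try
        {
            w = new Widget { X = Fail() };
        }
        catch (InvalidOperationException)
        {
            // w still holds its previous value here
        }
        return w == null;   // true
    }
}
```

Without the temporary, a naive translation (`w = new Widget(); w.X = Fail();`) would leave `w` pointing at a partially initialized object inside the catch block.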

Hans Passant
1

Structures are value types and allocated on the stack when used for local variables. But if you cast a local variable to Object or an interface, the value is boxed and allocated on the heap.

Consequently, structures are freed when they fall out of scope, unless they have been boxed and moved to the heap; after that, the garbage collector becomes responsible for freeing them once there is no longer any reference to the object.
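
A minimal sketch of boxing as described above. The `Counter` struct and the `BoxingDemo` class are hypothetical names invented for this example. Only the copy behaviour is observable from C#; where the runtime places the local itself remains an implementation detail.

```csharp
using System;

// Hypothetical value type for illustration.
struct Counter { public int Count; }

static class BoxingDemo
{
    public static (int, int) Run()
    {
        var c = new Counter { Count = 1 };

        object boxed = c;               // boxing: the value is copied into a new heap object
        c.Count = 2;                    // changes the local only, not the boxed copy

        var unboxed = (Counter)boxed;   // unboxing: the value is copied back out
        return (c.Count, unboxed.Count);    // (2, 1)
    }
}
```

The boxed copy is an ordinary heap object, so it is the garbage collector that eventually reclaims it once `boxed` is no longer reachable.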

I am not sure about the reason for all the compiler-generated local variables, but I assume they are used because you use object initializers. The objects are first initialized using a compiler-generated local variable and only after complete execution of the object initializers copied to your local variable. This ensures that you will never see an instance with only some of the object initializers executed.

Daniel Brückner
  • Value types are not guaranteed to be allocated on the stack. See [the stack is an implementation detail](http://blogs.msdn.com/ericlippert/archive/2009/04/27/the-stack-is-an-implementation-detail.aspx). – Aaronaught May 15 '10 at 13:50
  • Would it then be generally correct to say that the boxing and unboxing operations are fundamentally nothing else than moving value-type values between the stack and the heap? – stakx - no longer contributing May 15 '10 at 13:54
  • @stakx: **NO**, that is completely *incorrect*. Boxing a value type allows it to be passed around as a reference type so that copies of the object aren't made every time an assignment happens. The relationship to the stack and heap is **not your concern** as a programmer and the spec does **not make guarantees**. – Aaronaught May 15 '10 at 13:57
  • Of course it is only an implementation detail but the answer to the question depends on this implementation detail. – Daniel Brückner May 15 '10 at 13:58
  • *@Aaronaught:* That's a great blog post that you've linked to. I see now that my question in the comment above would have been unnecessary had I first read that blog entry. Thanks for the clarification! – stakx - no longer contributing May 15 '10 at 14:01
  • If you only look at the CLI specification you will not get an answer - the specification is intentionally vague in order to not impose any constraints on the implementation. – Daniel Brückner May 15 '10 at 14:02
  • *@Aaronaught* again: I think you'd deserve the mark for answering this question (if you happened to re-post as an answer what you've already said above). – stakx - no longer contributing May 15 '10 at 14:14
  • @Aaronaught "The relationship to the stack and heap is not your concern as a programmer". Unless you're interested in performance, in which case you'll want to learn about the performance characteristics of value and reference types. – J D May 14 '12 at 23:00