
I recently came across this Stack Overflow question: When to use struct?

One of its answers said something a bit profound:

In addition, realize that when a struct implements an interface - as Enumerator does - and is cast to that implemented type, the struct becomes a reference type and is moved to the heap. Internal to the Dictionary class, Enumerator is still a value type. However, as soon as a method calls GetEnumerator(), a reference-type IEnumerator is returned.

Exactly what does this mean?

If I had something like

struct Foo : IFoo 
{
  public int Foobar;
}

class Bar
{
  public IFoo Biz{get; set;} //assume this is Foo
}

...

var b=new Bar();
var f=b.Biz;
f.Foobar=123; //What would happen here
b.Biz.Foobar=567; //would this overwrite the above, or would it have no effect?
b.Biz=new Foo(); //and here!?

What exactly are the detailed semantics of a value-type structure being treated like a reference-type?

Community
Earlz
  • I think you answered this yourself - "the struct becomes a reference type **and is moved to the heap**" – Josh E Mar 04 '13 at 17:23
  • I don't know what a Foobar is in your example... – HackyStack Mar 04 '13 at 17:24
  • @JoshE so two references will point to the same thing? What if the value type changes? Is there any MSDN documentation or something explaining this? – Earlz Mar 04 '13 at 17:24
  • It is called boxing, and there is lots of documentation on MSDN. http://msdn.microsoft.com/en-us/library/yz2be5wk.aspx Have you tested the code you posted? – paparazzo Mar 04 '13 at 17:29
  • @Blam This doesn't appear to be "just" boxing. I mean, later there doesn't appear to be an unboxing and "the struct becomes a reference type and is moved to the heap" seems to indicate the unboxed type *becomes* a mutable boxed type – Earlz Mar 04 '13 at 17:33
  • think on this: if I try `Foo foo = b.Biz;` I get a 'Cannot implicitly convert type IFoo to Foo' compile error - you have to explicitly box/unbox, which is why and how `foo = null` is possible even though value types can't be `null` (pointer to value type on heap = null) – Josh E Mar 04 '13 at 17:35
  • @Earlz - two references could always point to the same object, sure. Just not in the examples you provided. In your example, `f.Foobar = 123` would result in a null reference exception since `b` won't have its `Biz` property initialized. – Josh E Mar 04 '13 at 17:44
  • Did you test the code? It throws an error. – paparazzo Mar 04 '13 at 18:11
  • @Blam it was just trimmed code. I actually made it compile and tested it in my answer below. There is a huge difference between boxing/unboxing and just "using" the interface – Earlz Mar 04 '13 at 19:28
  • No, that is not trimmed code from your answer or you would not even have posted the question. That trimmed code leaves out the reason for the failure - have to explicitly unbox. – paparazzo Mar 04 '13 at 22:04
  • @Earlz: Struct members receive `this` as a "byref" [the behind-the-scenes term for the thing passed by a `ref` parameter]. When a struct member is invoked on a boxed object, it behaves as though the boxed object had a field of the structure type, and passed that field by `ref` as the structure's `this` parameter. Note that there is no way for a method on a boxed structure to get a reference to the object containing it. If such a method wants to pass its own instance to a method that takes an interface or other reference type, it must re-box. – supercat Mar 05 '13 at 17:41

4 Answers


Every declaration of a structure type really declares two types within the runtime: a value type and a heap object type. From the point of view of external code, the heap object type behaves like a class with the fields and methods of the corresponding value type. From the point of view of internal code, the heap type behaves as though it has a field `this` of the corresponding value type.

Attempting to cast a value type to a reference type (Object, ValueType, Enum, or any interface type) will generate a new instance of its corresponding heap object type, and return a reference to that new instance. The same thing will happen if one attempts to store a value type into a reference-type storage location, or pass it as a reference-type parameter. Once the value has been converted to a heap object, it will behave--from the point of view of external code--as a heap object.
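The copy-then-reference behavior described above can be sketched as follows. This is a minimal illustration, not code from the question: it reuses the `IFoo`/`Foo` names from the test program further down, while `BoxingDemo` and `Run` are hypothetical names introduced here.

```csharp
using System;

interface IFoo
{
    int Foobar { get; set; }
}

struct Foo : IFoo
{
    public int Foobar { get; set; }
}

static class BoxingDemo
{
    // Returns the Foobar values seen through the original struct,
    // the boxed reference, and a second reference to the same box.
    public static (int Original, int Boxed, int Alias) Run()
    {
        var original = new Foo { Foobar = 1 };

        IFoo boxed = original; // boxing: the value is copied into a new heap object
        IFoo alias = boxed;    // plain reference copy: both variables point at the same box

        boxed.Foobar = 42;     // mutates the box, not the original struct

        return (original.Foobar, boxed.Foobar, alias.Foobar);
    }

    public static void Main()
    {
        var r = Run();
        Console.WriteLine(r.Original); // 1
        Console.WriteLine(r.Boxed);    // 42
        Console.WriteLine(r.Alias);    // 42
    }
}
```

The original struct keeps its value because the box holds a copy, while `boxed` and `alias` behave like two references to one class instance.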

The only situation in which a value type's implementation of an interface may be used without the value type first being converted to a heap object is when it's passed as a generic type parameter which has the interface type as a constraint. In that particular situation, interface members may be used on the value type instance without its having to be converted to a heap object first.
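A small sketch of the interface-constrained generic case, contrasted with passing the same struct as an interface-typed parameter. All names here (`ICounter`, `Counter`, `IncrementTwice`, `IncrementTwiceBoxed`) are hypothetical and exist only for illustration.

```csharp
using System;

interface ICounter
{
    int Count { get; }
    void Increment();
}

struct Counter : ICounter
{
    private int _count;
    public int Count => _count;
    public void Increment() => _count++;
}

static class GenericDemo
{
    // T is constrained to ICounter. For a struct T, the interface members
    // are invoked on the value itself, so no heap object is created.
    public static void IncrementTwice<T>(ref T counter) where T : ICounter
    {
        counter.Increment();
        counter.Increment();
    }

    // The parameter has the interface type, so a struct argument is boxed
    // at the call site and the method only ever sees that boxed copy.
    public static void IncrementTwiceBoxed(ICounter counter)
    {
        counter.Increment();
        counter.Increment();
    }

    public static (int AfterBoxed, int AfterGeneric) Run()
    {
        var c = new Counter();

        IncrementTwiceBoxed(c);   // increments a boxed copy; c is untouched
        int afterBoxed = c.Count; // 0

        IncrementTwice(ref c);    // increments c in place
        return (afterBoxed, c.Count);
    }

    public static void Main()
    {
        var r = Run();
        Console.WriteLine(r.AfterBoxed);   // 0
        Console.WriteLine(r.AfterGeneric); // 2
    }
}
```

The `ref` parameter is what lets the caller observe the change; without it the generic method would still avoid boxing, but it would operate on an ordinary by-value copy.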

supercat
  • You should highlight the last paragraph somehow, it is very important. – Vincent Jul 21 '15 at 11:13
  • @supercat being a generic parameter is not the only scenario leading to unboxed interface calls. If some value type implements `IDisposable` and is then instantiated inside a `using` statement (like `using (someclass.GetDisposableStruct()) { ; }`), the compiler is smart enough to emit the `constrained.` CIL instruction. I guess the same happens with `IEnumerable.GetEnumerator()` when it is called on value types from within `foreach`. – Sergey.quixoticaxis.Ivanov Feb 08 '17 at 14:27
  • @supercat - are you saying, that in any case I access the struct by the way of the interface (in the given case IFoo), the struct is actually converted to a heap object? – jasnevo Jul 10 '17 at 08:03
  • @jasnevo: If you use a variable of interface type, as opposed to an interface-constrained generic, the variable will only ever be able to hold `null` or a reference to an object stored on the heap. An attempt to store a struct into such a variable will create an object on the heap and store a reference to that object. – supercat Jul 10 '17 at 14:55

Read about boxing and unboxing (search the internet). For example MSDN: Boxing and Unboxing (C# Programming Guide).

See also the SO thread Why do we need boxing and unboxing in C#?, and the threads linked to that thread.

Note: It makes little difference whether you "convert" to a base class of the value type, as in

object obj = new Foo(); // boxing

or "convert" to an implemented interface, as in

IFoo iFoo = new Foo(); // boxing

The only base classes a struct has are System.ValueType and object (including dynamic). The base classes of an enum type are System.Enum, System.ValueType, and object.

A struct can implement any number of interfaces (but it inherits no interfaces from its base classes). An enum type implements IComparable (non-generic version), IFormattable, and IConvertible because the base class System.Enum implements those three.
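The base-class and interface claims above can be checked with reflection. The following is only a sketch: `Plain` and `BaseTypeDemo` are hypothetical names, and `System.DayOfWeek` is used simply as a convenient built-in enum.

```csharp
using System;

// An empty struct that declares no interfaces of its own.
struct Plain { }

static class BaseTypeDemo
{
    // True if the given enum type implements the three interfaces
    // that it inherits from System.Enum.
    public static bool EnumHasInheritedInterfaces(Type enumType)
    {
        return typeof(IComparable).IsAssignableFrom(enumType)
            && typeof(IFormattable).IsAssignableFrom(enumType)
            && typeof(IConvertible).IsAssignableFrom(enumType);
    }

    public static void Main()
    {
        Console.WriteLine(typeof(Plain).BaseType);     // System.ValueType
        Console.WriteLine(typeof(ValueType).BaseType); // System.Object
        Console.WriteLine(typeof(DayOfWeek).BaseType); // System.Enum
        Console.WriteLine(typeof(Enum).BaseType);      // System.ValueType

        // A struct gets no interfaces from System.ValueType or object.
        Console.WriteLine(typeof(Plain).GetInterfaces().Length); // 0

        Console.WriteLine(EnumHasInheritedInterfaces(typeof(DayOfWeek))); // True
    }
}
```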

Community
Jeppe Stig Nielsen
  • It's important to note that storing a struct in an interface-type location will copy its values to a heap object that will behave with reference semantics, but passing it as a constrained generic will not. Some interfaces like `IEquatable<T>` exist for the purpose of being invoked without boxing. – supercat May 14 '13 at 16:17
  • @supercat Agree. If I have `void MyMethod<T>(T t) { ... }` then of course `t` is not boxed if `T` is a value type. And if I add a constraint like `where T : IFace`, it still won't box `t` of course. But if, with the constraint, inside the method body I say `t.MemberOfIFace();` which will be legal because of my constraint, it will have to lead to boxing _at that point_, I guess. – Jeppe Stig Nielsen May 14 '13 at 21:28
  • Nope. One of the really great things about generics in .NET is that if one has a generic type parameter which is constrained to an interface, one may use interface members on things of that type without boxing. The just-in-time compiler will compile separate versions of a method for each different generic value type or combination thereof which is applicable to it. For any given struct type T, the JIT can determine what method should be invoked and generates code that calls that method directly without needing an object header for virtual dispatch. – supercat May 14 '13 at 21:42
  • I consider it unfortunate that there's no nice way to indicate syntactically that one wishes to use an interface member on a struct which is known to implement that interface without boxing it. If interface `IFoo` has member `int Bar(string)`, one could write a static method `static int CallIFooBar<T>(ref T it, string param) where T:IFoo { return it.Bar(param); }` and use it to invoke interface method `Bar` without boxing, but that's rather clunky. – supercat May 14 '13 at 21:45
  • @supercat Sounds great. Then even if `MemberOfIFace` was written with explicit interface implementation, the method will be called without first "casting" (boxing) the `t` variable to type `IFace`? Cool. (I was referring to your second comment.) – Jeppe Stig Nielsen May 14 '13 at 21:51
  • @supercat I found this related thread: [Why does calling an explicit interface implementation on a value type cause it to be boxed?](http://stackoverflow.com/questions/5812099/) – Jeppe Stig Nielsen May 14 '13 at 22:01

I'm replying to your post about your experiment from 2013-03-04, though I might be a bit late :)

Keep this in mind: every time you assign a struct value to a variable of an interface type (or return it as an interface type), it will be boxed. Think of it this way: a new object (the box) is created on the heap, and the value of the struct is copied into it. That box is kept alive for as long as you hold a reference to it, just like any other object.

With behavior 1, you have the Biz auto-property of type IFoo, so when you set a value, it is boxed and the property keeps a reference to the box. Whenever you get the value of the property, that same box is returned. This way, it mostly works as if Foo were a class, and you get what you expect: you set a value and you get it back.

Now, with behavior 2, you store a struct (field tmp), and your Biz property returns its value as an IFoo. That means every time get_Biz is called, a new box will be created and returned.

Look through the Main method: every time you see a b.Biz, that's a different object (box). That will explain the actual behavior.

E.g. in the line

    b.Biz.Foobar=567;

b.Biz returns a box on the heap, you set the Foobar in it to 567, and then, since you do not keep a reference to it, it is immediately lost to your program.

In the next line you pass b.Biz.Foobar to Console.WriteLine, but this call to b.Biz again creates a brand-new box whose Foobar is still 0, and that is what gets printed.

In the line after that, f.Foobar is printed. The variable f was also filled earlier by a b.Biz call that created a new box, but you kept a reference to that box (f) and set its Foobar to 123, so that is what the box holds for the rest of the method.
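The difference between the two behaviors comes down to box identity, which can be made visible with `object.ReferenceEquals`. This is a sketch rather than the original test program: `IFoo` and `Foo` match the question, while `BarKeepsBox`, `BarKeepsStruct`, and `BoxIdentityDemo` are hypothetical names standing in for behavior 1 and behavior 2.

```csharp
using System;

interface IFoo
{
    int Foobar { get; set; }
}

struct Foo : IFoo
{
    public int Foobar { get; set; }
}

// Behavior 1: the property stores the interface reference, i.e. the box itself.
class BarKeepsBox
{
    public IFoo Biz { get; set; } = new Foo();
}

// Behavior 2: the property stores the struct, so every get boxes a fresh copy.
class BarKeepsStruct
{
    private Foo tmp;

    public IFoo Biz
    {
        get { return tmp; }         // boxing happens here, on every call
        set { tmp = (Foo)value; }   // unboxing copies the value into tmp
    }
}

static class BoxIdentityDemo
{
    public static void Main()
    {
        var b1 = new BarKeepsBox();
        Console.WriteLine(object.ReferenceEquals(b1.Biz, b1.Biz)); // True
        b1.Biz.Foobar = 567;               // writes into the stored box
        Console.WriteLine(b1.Biz.Foobar);  // 567

        var b2 = new BarKeepsStruct();
        Console.WriteLine(object.ReferenceEquals(b2.Biz, b2.Biz)); // False
        b2.Biz.Foobar = 567;               // writes into a temporary box that is then dropped
        Console.WriteLine(b2.Biz.Foobar);  // 0
    }
}
```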

dzs

So, I decided to put this behavior to the test myself. I'll give the "results", but I can't explain why things happen this way. Hopefully someone with more knowledge about how this works can come along and enlighten me with a more thorough answer.

Full test program:

using System;

namespace Test
{
    interface IFoo
    {
        int Foobar{get;set;}
    }
    struct Foo : IFoo 
    {
        public int Foobar{ get; set; }
    }

    class Bar
    {
        Foo tmp;
        //public IFoo Biz{get;set;}; //behavior #1
        public IFoo Biz{ get { return tmp; } set { tmp = (Foo) value; } } //behavior #2

        public Bar()
        {
            Biz=new Foo(){Foobar=0};
        }
    }


    class MainClass
    {
        public static void Main (string[] args)
        {
            var b=new Bar();
            var f=b.Biz;
            f.Foobar=123; 
            Console.WriteLine(f.Foobar); //123 in both
            b.Biz.Foobar=567;
            Console.WriteLine(b.Biz.Foobar); //567 in behavior 1, 0 in 2
            Console.WriteLine(f.Foobar); //567 in behavior 1, 123 in 2
            b.Biz=new Foo();
            b.Biz.Foobar=5;
            Console.WriteLine(b.Biz.Foobar); //5 in behavior 1, 0 in 2
            Console.WriteLine(f.Foobar); //567 in behavior 1, 123 in 2
        }
    }
}

As you can see, by manually boxing/unboxing we get extremely different behavior. I don't completely understand either behavior though.

Earlz
  • In 99% of cases, you can predict what the code will do if you figure that variables of the heap type will hold references to heap instances, and casting a reference of one heap type to a reference to another will yield a reference to the *same* instance; casting value type to the heap type or vice versa will make a copy of the data in question. – supercat Mar 05 '13 at 17:20
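The rule of thumb in the comment above (reference-to-reference casts keep the same instance, value-to-heap and heap-to-value casts copy) can be sketched like this. `IFoo` and `Foo` follow the test program; `CastDemo` and `Run` are hypothetical names.

```csharp
using System;

interface IFoo
{
    int Foobar { get; set; }
}

struct Foo : IFoo
{
    public int Foobar { get; set; }
}

static class CastDemo
{
    public static (bool SameInstance, int CopyValue, int BoxValue) Run()
    {
        object boxed = new Foo { Foobar = 1 }; // value -> heap: copies into a new box
        IFoo viaInterface = (IFoo)boxed;       // heap -> heap: same instance, no copy
        Foo copy = (Foo)boxed;                 // heap -> value: unboxing copies the data out

        viaInterface.Foobar = 2;               // changes the shared box only

        return (object.ReferenceEquals(boxed, viaInterface),
                copy.Foobar,
                ((IFoo)boxed).Foobar);
    }

    public static void Main()
    {
        var r = Run();
        Console.WriteLine(r.SameInstance); // True
        Console.WriteLine(r.CopyValue);    // 1
        Console.WriteLine(r.BoxValue);     // 2
    }
}
```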