
Consider the following struct:

    struct S
    {
        public string s;
    }

What is the difference between case 1:

    S instance = new S();
    instance.s = "foo";

and case 2:

    S instance;
    instance.s = "foo";

Both versions compile and run fine.
I am just curious to know what happens behind the scenes.

Edit:
I guess that in case 2 the variable is unassigned until a value is stored in its s field, because this doesn't compile:

    S inst;
    if (inst.s == null)
        inst.s = "foo";  // Compiler error: Use of possibly unassigned field 's'

while this does:

    S inst;
    inst.s = "foo";
    if (inst.s == null)
        inst.s = "bar";

and this also works:

    S inst = new S();
    if (inst.s == null)
        inst.s = "foo";

I welcome any deeper explanation of this behavior.

Update:
I found these two posts, which complement Marc's answer:
why are mutable structs evil
when to use struct in c#

Mehdi LAMRANI

3 Answers


What is the difference between

    S instance = new S();
    instance.s = "foo";

and

    S instance;
    instance.s = "foo";

?

As Marc correctly points out, both are equally bad; the right thing to do is to make an immutable struct that takes the string in its constructor.

And as Marc correctly points out, functionally there is no difference.

However, that does not answer the question that you actually asked, which is "what happens behind the scenes?" By "behind the scenes" I'm assuming that you're talking about the compiler's semantic analysis of the code, as described in the C# specification.

Fortunately the specification is extremely clear on the difference between these two cases.

First off, as you correctly note, in the first case the variable is considered to be definitely assigned after the first statement. In the second case the variable is not considered to be definitely assigned until after the second statement.

However, the definite assignment analysis is simply a consequence of the actual meaning of the code, which is as follows.

The first fragment:

  • allocates storage for instance
  • allocates temporary storage for the temporary value
  • initializes the temporary value to the default struct state
  • copies the bits of the temporary value to the storage for instance
  • now instance is definitely assigned because all its bits were copied from another value
  • copies the string reference into the storage for instance

The second fragment:

  • allocates storage for instance
  • copies the string reference into the storage for instance
  • now instance is definitely assigned because all its fields have been assigned
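As a rough illustration of the definite-assignment consequences of the two sequences above, here is a minimal sketch. The class and method names (DefiniteAssignmentDemo, WithNew, WithoutNew) are made up for this example; only the struct S comes from the question.

```csharp
using System;

struct S
{
    public string s;
}

static class DefiniteAssignmentDemo
{
    // Fragment 1: new S() produces the default state (s == null), so the
    // variable is definitely assigned at once and its field may be read.
    public static string WithNew()
    {
        S instance = new S();
        if (instance.s == null)
            instance.s = "foo";
        return instance.s;
    }

    // Fragment 2: no initializer. The variable becomes definitely assigned
    // only after its single field has been written.
    public static string WithoutNew()
    {
        S instance;
        instance.s = "foo";
        S copy = instance;   // allowed: every field of instance is assigned
        return copy.s;
    }

    static void Main()
    {
        Console.WriteLine(WithNew());     // foo
        Console.WriteLine(WithoutNew());  // foo
    }
}
```

Both methods return the same value; the difference lies only in the point at which the compiler considers the variable assigned.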

The compiler is permitted to notice that there is no way to determine whether or not a temporary was created, initialized and copied. If it does determine that then it is permitted to elide the creation of the temporary, and generate the same code for both fragments. The compiler is not required to do so; this is an optimization and optimizations are never required.

Now, you might wonder under what circumstances you can determine that a temporary was created, initialized and copied. If you do wonder that then read my article on the subject:

https://ericlippert.com/2010/10/11/debunking-another-myth-about-value-types/

Eric Lippert

Functionally, nothing. Without the new(), the S instance isn't "definitely assigned" at its declaration, but since your original question assigns its only field straight away, that doesn't matter. A struct variable in C# is definitely assigned if either:

  • it is explicitly assigned a value, via an expression (perhaps a new())
  • all fields on the uninitialized struct are explicitly assigned
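To make the second rule concrete, here is a small sketch using a hypothetical two-field struct (Pair, with fields x and y, is an assumption for illustration and is not part of the question). The compiler tracks each field separately, so an assigned field can be read early, but the struct as a whole can only be used once every field has been written.

```csharp
using System;

// Hypothetical two-field struct, used only for this illustration.
struct Pair
{
    public int x;
    public int y;
}

static class PartialAssignmentDemo
{
    public static int Sum()
    {
        Pair p;
        p.x = 1;
        int first = p.x;   // allowed: p.x on its own is definitely assigned
        // Pair q = p;     // would not compile here: p.y is still unassigned,
        //                 // so p counts as an unassigned local variable
        p.y = 2;
        Pair q = p;        // allowed: all fields of p are now assigned
        return first + q.y;
    }

    static void Main()
    {
        Console.WriteLine(Sum()); // 3
    }
}
```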

In reality, mutable structs are a really, really bad idea unless you know exactly what you are doing (and why). Also, public fields are usually a bad idea too. Mix two bad ideas, for fun ;p

If you really want to use a struct here, my version would be:

S instance = new S("foo");

with:

struct S {
  private readonly string value;
  public string Value { get { return value; } }
  public S(string value) { this.value = value; }
  public override string ToString() { return value ?? ""; }
  public override int GetHashCode() { return value == null ? 0 : value.GetHashCode(); }
  // The "obj is S" check avoids an exception when obj is null or not an S
  public override bool Equals(object obj) { return obj is S && value == ((S) obj).value; }
}
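A short usage sketch of an immutable struct along these lines, assuming nothing beyond the members shown. The demo class name (ImmutableStructDemo) is invented, and the type check inside Equals is an addition for this sketch so that comparing against a non-S object returns false rather than throwing.

```csharp
using System;

struct S
{
    private readonly string value;
    public string Value { get { return value; } }
    public S(string value) { this.value = value; }
    public override string ToString() { return value ?? ""; }
    public override int GetHashCode() { return value == null ? 0 : value.GetHashCode(); }
    // Type check added for this sketch: non-S arguments compare unequal.
    public override bool Equals(object obj) { return obj is S && value == ((S)obj).value; }
}

static class ImmutableStructDemo
{
    static void Main()
    {
        S a = new S("foo");
        S b = new S("foo");
        S empty = default(S);   // value is null; ToString() falls back to ""

        Console.WriteLine(a.Equals(b));             // True
        Console.WriteLine(a.Equals("foo"));         // False
        Console.WriteLine(empty.ToString() == "");  // True
        Console.WriteLine(a.Value);                 // foo
    }
}
```

Because the field is readonly and only set in the constructor, the question of partially assigned instances never comes up in calling code.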
Marc Gravell
  • Thank you Marc. What about partial initialization of "some" fields? I tried it and saw no difference from explicit assignment while debugging. As for the readonly aspect making it immutable, I have rarely seen this respected in the code I have come across, so I'll be careful about that in the future. – Mehdi LAMRANI Feb 22 '12 at 15:15
  • Added an edit with links to SOF topics completing your answer – Mehdi LAMRANI Feb 22 '12 at 15:17
  • @MarcGravell: It makes me a little wary to use `==` for an `Equals` in this manner. It took me some effort to reason about your algorithm to be sure it was actually safe; there are odd cases where `==` between strings is not safe due to interning. They don't apply here because `Value` is always an unboxed `string`. I prefer obviously safe code. Thus, I prefer implementations of `Equals` to call `Equals` when working with reference types. – Brian Feb 22 '12 at 16:58
  • @Brian the compiler knows that as a string, so == is perfectly valid in all cases. The problem you describe is when it is only known as "object", when that becomes ReferenceEquals. It is perfectly fine in this case. There is no such thing as a boxed or an unboxed string; strings are never boxed. – Marc Gravell Feb 22 '12 at 18:12
  • @MarcGravell: I agree that my terminology is wrong, but I still think my underlying complaint has substance. IMO, it smells to use `==` to compare reference types for equality, even though in this case you can get away with it. – Brian Feb 22 '12 at 18:26
  • @Brian "got away with it" suggests it was by chance; but no, it was by knowledge of the formal API. It is **required** to work the way I have used it. Do you honestly use Equals for all your string equality tests? That sounds... Unnecessarily hard to read. – Marc Gravell Feb 22 '12 at 19:36
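For readers following the == versus Equals exchange in the comments above, here is a minimal sketch of the distinction being discussed. The class and member names are made up for illustration. When both operands are statically typed as string, == compares contents; when they are only known as object, == compares references.

```csharp
using System;

static class StringEqualityDemo
{
    // Two distinct string instances with the same contents.
    static readonly string A = "foo";
    static readonly string B = new string(new[] { 'f', 'o', 'o' });

    // Operands typed as string: the string == operator compares contents.
    public static bool TypedAsString() { return A == B; }

    // Operands typed as object: == is a reference comparison.
    public static bool TypedAsObject() { object oa = A, ob = B; return oa == ob; }

    // Virtual Equals dispatches to string.Equals and compares contents.
    public static bool ViaEquals() { object oa = A, ob = B; return oa.Equals(ob); }

    static void Main()
    {
        Console.WriteLine(TypedAsString()); // True
        Console.WriteLine(TypedAsObject()); // False
        Console.WriteLine(ViaEquals());     // True
    }
}
```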
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        struct S { public string s; } 
        static void Main(string[] args)
        {
            S instance = new S();
            instance.s = "foo";
            S instance1;
            instance1.s = "foo";
        }
    }
}

c:>ildasm c:\Blyme\ConsoleApplication1\bin\Debug\ConsoleApplication1.exe

gives me the following MSIL for the Main method:

.method private hidebysig static void  'Main'(string[] 'args') cil managed
{
  .entrypoint
  // Code size       34 (0x22)
  .maxstack  2
  .locals init (valuetype 'ConsoleApplication1'.'Program'/'S' V_0,
           valuetype 'ConsoleApplication1'.'Program'/'S' V_1)
  IL_0000:  nop
  IL_0001:  ldloca.s   V_0
  IL_0003:  initobj    'ConsoleApplication1'.'Program'/'S'
  IL_0009:  ldloca.s   V_0
  IL_000b:  ldstr      "foo"
  IL_0010:  stfld      string 'ConsoleApplication1'.'Program'/'S'::'s'
  IL_0015:  ldloca.s   V_1
  IL_0017:  ldstr      "foo"
  IL_001c:  stfld      string 'ConsoleApplication1'.'Program'/'S'::'s'
  IL_0021:  ret
} // end of method 'Program'::'Main'

There is no difference in how the two locals are allocated (both are declared in the same .locals init block); the only difference is the additional initobj instruction emitted for case 1. Case 2 shows that initobj is not needed, which makes sense given that S is a value type.

Would you usually say

int a = 0; or int a = new int();

I guess the latter looks more "cosmetically" correct.
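As a small sanity check of that comparison, the sketch below (class and method names are made up) shows that the parameterless new form and default(...) both yield the zeroed value, for int as well as for the struct from the question.

```csharp
using System;

struct S
{
    public string s;
}

static class DefaultValueDemo
{
    // new int() and default(int) both produce 0.
    public static bool IntsAreZero()
    {
        int a = 0;
        int b = new int();
        int c = default(int);
        return a == b && b == c;
    }

    // new S() and default(S) both leave the reference field null.
    public static bool StructFieldsAreNull()
    {
        S s1 = new S();
        S s2 = default(S);
        return s1.s == null && s2.s == null;
    }

    static void Main()
    {
        Console.WriteLine(IntsAreZero());          // True
        Console.WriteLine(StructFieldsAreNull());  // True
    }
}
```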

Krishna