77

In C#, structs are handled by value and class instances by reference. As I understand it, when creating an instance of a class, the keyword new causes C# to use the class information to make the instance, as below:

class MyClass
{
    ...
}
MyClass mc = new MyClass();

For a struct, you're not creating an object but simply setting a variable to a value:

struct MyStruct
{
    public string name;
}
MyStruct ms;
//MyStruct ms = new MyStruct();     
ms.name = "donkey";

What I do not understand is this: if I declare the variable with MyStruct ms = new MyStruct(), what is the keyword new doing in that statement? If a struct cannot be an object, what is the new here instantiating?

jbat100
KMC
  • 3
    An instance of a `struct` *is* an object. The distinction you are probably misunderstanding is that between value types and reference types. – Ed S. Feb 09 '12 at 08:40
  • but in C there is no object and struct is not an object. So in C# struct is implemented as object? – KMC Feb 12 '12 at 08:15
  • 1
    Thinking of C# in terms of C is not helpful. Ignore the syntactical differences, they are completely different languages. – Ed S. Feb 12 '12 at 09:29
  • 1
    @KMC Even in C there is an object. You misunderstand what “object” means – understandable, since it means many different things in different contexts. In C++ (and I think C is similar) for instance it’s simply a space in memory: everything that resides in memory is an object. – Konrad Rudolph Jun 09 '13 at 12:51
  • Related Answer: http://stackoverflow.com/a/3943596/380384. When not to initialize `struct` with `new`. – John Alexiou Aug 22 '14 at 14:51

6 Answers

68

From struct (C# Reference) on MSDN:

When you create a struct object using the new operator, it gets created and the appropriate constructor is called. Unlike classes, structs can be instantiated without using the new operator. If you do not use new, the fields will remain unassigned and the object cannot be used until all of the fields are initialized.

To my understanding, you won't actually be able to use a struct properly without using new unless you make sure you initialise all the fields manually. If you use the new operator, then a properly-written constructor has the opportunity to do this for you.

Hope that clears it up. If you need clarification on this let me know.


Edit

There's quite a long comment thread, so I thought I'd add a bit more here. I think the best way to understand it is to give it a go. Make a console project in Visual Studio called "StructTest" and copy the following code into it.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace struct_test
{
    class Program
    {
        public struct Point
        {
            public int x, y;

            public Point(int x)
            {
                this.x = x;
                this.y = 5;
            }

            public Point(int x, int y)
            {
                this.x = x;
                this.y = y;
            }

            // It will break with this constructor. If uncommenting this one
            // comment out the other one with only one integer, otherwise it
            // will fail because you are overloading with duplicate parameter
            // types, rather than what I'm trying to demonstrate.
            /*public Point(int y)
            {
                this.y = y;
            }*/
        }

        static void Main(string[] args)
        {
            // Declare an object:
            Point myPoint;
            //Point myPoint = new Point(10, 20);
            //Point myPoint = new Point(15);
            //Point myPoint = new Point();


            // Initialize:
            // Try not using any constructor but comment out one of these
            // and see what happens. (It should fail when you compile it)
            myPoint.x = 10;
            myPoint.y = 20;

            // Display results:
            Console.WriteLine("My Point:");
            Console.WriteLine("x = {0}, y = {1}", myPoint.x, myPoint.y);

            Console.ReadKey(true);
        }
    }
}

Play around with it. Remove the constructors and see what happens. Try using a constructor that only initialises one variable (I've commented one out; it won't compile). Try with and without the new keyword (I've commented out some examples; uncomment them and give them a try).

StayOnTarget
joshhendo
  • "structs can be instantiated"? but struct cannot be an object, isn't it? Are the "fields" in the struct the properties and methods - if struct is not an object why does its fields need to be initialized? I think I need more clarification. thanks. – KMC Feb 09 '12 at 08:34
  • Why *wouldn't* the struct's fields need to be initialised, if you don't call a constructor? If you don't initialise them, and you don't call a constructor to initialise them, they remain uninitialised. –  Feb 09 '12 at 08:37
  • A struct is an object of sorts. Described on the Microsoft website as a "lightweight object." It can have variables, but not functions. A struct is sort of like a class, but all the members are public and you can't have any functions. It allows you to store information, but you can't manipulate or control that information like you can in a class. You could make a "new" struct to use the same variable but clear all the data. – joshhendo Feb 09 '12 at 08:40
  • 8
    @joshhendo: Huh? An instance of a struct is an object and they can certainly contain methods and private fields. You are confusing a beginner here. – Ed S. Feb 09 '12 at 08:42
  • The terminology may be a bit confusing. Structs were originally around in C, which isn't an object orientated language. They still exist in C++ and other C-like languages (which C# is.) It may be better to think of a structure as a "record," or set of related variables grouped together. It's simply a condition of C# structs that all the variables inside a struct need to be initialised before the struct can be used. You can do that manually, or using the new operator. It may be better to think of structs as a "lightweight object." – joshhendo Feb 09 '12 at 08:46
  • `If you use the new operator, the constructor will do this for you.` I don't think last statement is completely true. Constructor won't initialize member variables for you. You would have to do it by yourself in constructor. Please correct me if I am wrong here – Haris Hasan Feb 09 '12 at 08:48
  • That may be correct, I was making the assumption that for a struct this generally would be done. I guess it wouldn't initialise every variable if you didn't want it to, but from what I can see on the MS website, it would then consider the struct to be "initialised." I'll double check this and edit my answer to reflect what I find. – joshhendo Feb 09 '12 at 08:54
  • Haris: I've double checked, and you won't be able to compile a program that has a constructor for a struct that doesn't assign a value to each variable. So the statement is correct. The main benefit of using a constructor is that you could have default values if you wanted and overload the constructor and that it may look neater, but for anything more complex than that it would be better to use a class. – joshhendo Feb 09 '12 at 08:59
  • That's right for the case where you define your own constructor. But what if you use the default constructor? Statement (If you use the new operator, the constructor will do this for you) won't remain correct in that case. That was my point – Haris Hasan Feb 09 '12 at 09:02
  • Ahh, sorry about that. Yes, you're right, just tested it. So basically, if you use the new operator on a non-default constructor, you're guaranteed to have values, though if you use it with the default constructor, you're effectively just clearing it out (I'm hesitant to say make a new one since that's the word I'm explaining.) Garbage collection should get rid of the old instance. – joshhendo Feb 09 '12 at 09:10
  • Actually, using the default constructor will initialise int's to zero (I'm presuming it will assign values to either null or 0). If you do not use any constructor (so have Point myPoint) and then only assign a value to one variable (out of at least 2) you will get a compile time error. For example, in my example above if I only assign a value to 'x' I get "Error: Use of possibly unassigned field 'y'" when I try and compile. – joshhendo Feb 09 '12 at 09:13
21

Catch Eric Lippert's excellent answer from this thread. To quote him:

When you "new" a value type, three things happen. First, the memory manager allocates space from short term storage. Second, the constructor is passed a reference to the short term storage location. After the constructor runs, the value that was in the short-term storage location is copied to the storage location for the value, wherever that happens to be. Remember, variables of value type store the actual value.

(Note that the compiler is allowed to optimize these three steps into one step if the compiler can determine that doing so never exposes a partially-constructed struct to user code. That is, the compiler can generate code that simply passes a reference to the final storage location to the constructor, thereby saving one allocation and one copy.)
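The quoted steps can be observed with a small sketch. The `Pair` struct, its callback parameter, and `TempCopyDemo` are hypothetical names made up for illustration; the assumption is that a callback run in the middle of the constructor looks at the variable being assigned to. If the constructor works on temporary storage first, the callback should still see the old value.

```csharp
using System;

struct Pair
{
    public int X;
    public int Y;

    // Hypothetical constructor: runs a callback between the two field writes.
    public Pair(int x, int y, Action betweenWrites)
    {
        X = x;
        betweenWrites();
        Y = y;
    }
}

static class TempCopyDemo
{
    // Returns target.X as seen by the callback while the constructor is still running.
    public static int ObserveDuringConstruction()
    {
        Pair target = default(Pair);   // X == 0, Y == 0
        int seen = -1;

        // The constructor fills in a temporary; target is overwritten only afterwards.
        target = new Pair(10, 20, () => seen = target.X);

        return seen;                   // expected: 0, not 10
    }

    // Returns target.X after the whole assignment has completed.
    public static int ObserveAfterAssignment()
    {
        Pair target = default(Pair);
        target = new Pair(10, 20, () => { });
        return target.X;               // 10
    }
}
```

Here the callback can observe `target`, so the compiler is not free to construct in place; when nothing can observe the difference, it may skip the temporary, as the note above says.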

(Making this answer since it really is one)

Community
nawfal
  • It's worthwhile to note that because `out` parameters are a C# concept, rather than one used by the .NET Runtime, passing a partially-constructed struct as an `out` parameter to an external method will expose its values to outside code, even though the C# compiler will assume it won't. One could, for example, define a struct in such a way that `myThing = new myThing(5);` will initialize one field of `myThing` while leaving the others unaffected. – supercat Jun 09 '13 at 15:48
  • 3
    I perceive that different language groups probably have their own vision of what .NET should be, and pretend that it fits their vision. For example, the C# group probably figures that .NET *should* have enforceable `out` parameters, and if everyone programmed in C# it would, but a virtual method with an `out` parameter will be regarded by other languages as a virtual method with a `ref` parameter. In some cases it's nice that languages aren't limited to the minimal subset of features that other language implementers may want to implement, but there are dangers too. – supercat Jun 17 '13 at 22:02
  • 1
    This should be the answer – Monku May 05 '18 at 22:45
5

In a struct, the new keyword is needlessly confusing. It doesn't do anything. It's just required if you want to use the constructor. It does not perform a new.

The usual meaning of new is to allocate permanent storage (on the heap.) A language like C++ allows new myObject() or just myObject(). Both call the same constructor. But the former creates a new object and returns a pointer. The latter merely creates a temp. Any struct or class can use either. new is a choice, and it means something.

C# doesn't give you a choice. Classes are always in the heap, and structs are always on the stack. It isn't possible to perform a real new on a struct. Experienced C# programmers are used to this. When they see ms = new MyStruct(); they know to ignore the new as just syntax. They know it's acting like ms = MyStruct(), which merely assigns to an existing object.
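A rough sketch of the "assigns to an existing object" point, with hypothetical types `PointStruct` and `PointClass` that are not from the question: for the struct, `new` overwrites the value held in the variable; for the class, `new` makes the variable refer to a second, distinct instance.

```csharp
using System;

struct PointStruct { public int X; }
class PointClass { public int X; }

static class NewAssignDemo
{
    // For a struct, "a = new PointStruct()" replaces the value stored in a.
    public static int StructValueAfterNew()
    {
        PointStruct a = new PointStruct();
        a.X = 7;
        a = new PointStruct();    // a's fields are reset to their defaults
        return a.X;               // 0
    }

    // An earlier copy is a separate value and is not affected by the reset.
    public static int StructCopyKeepsOldValue()
    {
        PointStruct a = new PointStruct();
        a.X = 7;
        PointStruct copy = a;     // copies the whole value
        a = new PointStruct();
        return copy.X;            // 7
    }

    // For a class, "c = new PointClass()" points c at a different instance.
    public static bool ClassNewGivesSameInstance()
    {
        PointClass c = new PointClass();
        PointClass alias = c;     // second reference to the first instance
        c = new PointClass();
        return object.ReferenceEquals(c, alias);   // false
    }
}
```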

Oddly(?), classes require the new. c=myClass(); isn't allowed (using the constructor to set values of existing object c.) You'd have to make something like c.init();. So you really never have a choice -- constructors always allocate for classes, and never for structs. The new is always just decoration.

I assume the reason for requiring fake new's in structs is so you can easily change a struct into a class (assuming you always use myStruct=new myStruct(); when you first declare, which is recommended.)

Owen Reynolds
  • 1
    Thinking in terms of implementation details is wrong, structs aren't always allocated on the stack, e.g. struct fields on a class, boxing, etc... and new actually does something! – iam3yal Jun 07 '16 at 18:06
  • 2
    The confusion is that people are reading this question as "why should I use new with structs." But if you read it, the question is really that last phrase "what is the new here instantiating?" It's a question about why they choose that funny syntax. – Owen Reynolds Jun 08 '16 at 22:44
3

Using "new MyStruct()" ensures that all fields are set to some value. In the case above, nothing is different. If instead of setting ms.name you were trying to read it, you would get a "Use of possibly unassigned field 'name'" error in VS.
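A minimal sketch of both routes, reusing the question's `MyStruct` with one extra field (`count`) added purely for illustration:

```csharp
using System;

struct MyStruct
{
    public string name;
    public int count;   // hypothetical extra field, not in the question
}

static class DefaultValueDemo
{
    // With new, every field has a value, so reading is allowed immediately.
    public static bool NewSetsAllFieldsToDefaults()
    {
        MyStruct ms = new MyStruct();
        return ms.name == null && ms.count == 0;   // true
    }

    // Without new, each field must be assigned before the struct is read as a whole.
    public static string AssignEveryFieldManually()
    {
        MyStruct ms;
        ms.name = "donkey";
        ms.count = 1;
        MyStruct copy = ms;   // legal only because all fields are assigned
        return copy.name;     // "donkey"
    }
}
```

Removing the `ms.count = 1;` line should make the copy fail to compile, since `ms` is then not fully assigned.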

Daryl
3

Any time an object or struct comes into existence, all of its fields come into existence as well; if any of those fields are struct types, all nested fields come into existence as well. When an array is created, all of its elements come into existence (and, as above, if any of those elements are structs, the fields of those structs also come into existence). All of this occurs before any constructor code has a chance to run.
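As a small illustration of fields and array elements existing before any constructor runs, here is a sketch with made-up types (`Cell`, `Inner`, `Holder`); nothing in it calls `new` on a struct.

```csharp
using System;

struct Inner { public int Depth; }

struct Cell
{
    public int Value;
    public Inner Nested;   // nested struct field, exists as part of Cell
}

class Holder { public Cell Slot; }

static class ExistenceDemo
{
    // Creating the array creates all three Cell elements, zeroed.
    public static int ArrayElementsAlreadyExist()
    {
        Cell[] cells = new Cell[3];
        cells[1].Nested.Depth = 42;    // writes directly into the array element
        return cells[0].Value + cells[1].Nested.Depth;   // 0 + 42
    }

    // Creating the Holder object creates its Slot field, zeroed.
    public static int FieldOfClassAlreadyExists()
    {
        Holder h = new Holder();
        int before = h.Slot.Value;     // 0, readable without any struct constructor
        h.Slot.Value = 5;
        return before + h.Slot.Value;  // 0 + 5
    }
}
```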

In .NET, a struct constructor is effectively nothing more than a method which takes a struct as an 'out' parameter. In C#, an expression which calls a struct constructor will allocate a temporary struct instance, call the constructor on that, and then use that temporary instance as the value of the expression. Note that this is different from VB.NET, where the generated code for a constructor will start by zeroing out all fields, but where the code from the caller will attempt to have the constructor operate directly upon the destination. For example: myStruct = new myStructType(whatever) in VB.NET will clear myStruct before the first statement of the constructor executes; within the constructor, any writes to the object under construction will immediately operate upon myStruct.

supercat
0

ValueType and structures are something special in C#. Here I'm showing you what happens when you new something.

Here we have the following

  • Code

    using System;
    using System.Runtime.InteropServices; // needed for [StructLayout]
    
    partial class TestClass {
        public static void NewLong() {
            var i=new long();
        }
    
        public static void NewMyLong() {
            var i=new MyLong();
        }
    
        public static void NewMyLongWithValue() {
            var i=new MyLong(1234);
        }
    
        public static void NewThatLong() {
            var i=new ThatLong();
        }
    }
    
    [StructLayout(LayoutKind.Sequential)]
    public partial struct MyLong {
        const int bits=8*sizeof(int);
    
        public static implicit operator int(MyLong x) {
            return (int)x.m_Low;
        }
    
        public static implicit operator long(MyLong x) {
            long y=x.m_Hi;
            return (y<<bits)|x.m_Low;
        }
    
        public static implicit operator MyLong(long x) {
            var y=default(MyLong);
            y.m_Low=(uint)x;
            y.m_Hi=(int)(x>>bits);
            return y;
        }
    
        public MyLong(long x) {
            this=x;
        }
    
        uint m_Low;
        int m_Hi;
    }
    
    public partial class ThatLong {
        const int bits=8*sizeof(int);
    
        public static implicit operator int(ThatLong x) {
            return (int)x.m_Low;
        }
    
        public static implicit operator long(ThatLong x) {
            long y=x.m_Hi;
            return (y<<bits)|x.m_Low;
        }
    
        public static implicit operator ThatLong(long x) {
            return new ThatLong(x);
        }
    
        public ThatLong(long x) {
            this.m_Low=(uint)x;
            this.m_Hi=(int)(x>>bits);
        }
    
        public ThatLong() {
            int i=0;
            var b=i is ValueType;
        }
    
        uint m_Low;
        int m_Hi;
    }
    

And the generated IL of the methods of the test class would be

  • IL

    // NewLong
    .method public hidebysig static 
        void NewLong () cil managed 
    {
        .maxstack 1
        .locals init (
            [0] int64 i
        )
    
        IL_0000: nop
        IL_0001: ldc.i4.0 // push 0 as int
        IL_0002: conv.i8  // convert the pushed value to long
        IL_0003: stloc.0  // pop it to the first local variable, that is, i
        IL_0004: ret
    } 
    
    // NewMyLong
    .method public hidebysig static 
        void NewMyLong () cil managed 
    {
        .maxstack 1
        .locals init (
            [0] valuetype MyLong i
        )
    
        IL_0000: nop
        IL_0001: ldloca.s i     // push address of i
        IL_0003: initobj MyLong // pop address of i and initialize as MyLong
        IL_0009: ret
    } 
    
    // NewMyLongWithValue 
    .method public hidebysig static 
        void NewMyLongWithValue () cil managed 
    {
        .maxstack 2
        .locals init (
            [0] valuetype MyLong i
        )
    
        IL_0000: nop
        IL_0001: ldloca.s i  // push address of i
        IL_0003: ldc.i4 1234 // push 1234 as int
        IL_0008: conv.i8     // convert the pushed value to long
    
        // call the constructor
        IL_0009: call instance void MyLong::.ctor(int64) 
    
        IL_000e: nop
        IL_000f: ret
    } 
    
    // NewThatLong
    .method public hidebysig static 
        void NewThatLong () cil managed 
    {
        // Method begins at RVA 0x33c8
        // Code size 8 (0x8)
        .maxstack 1
        .locals init (
            [0] class ThatLong i
        )
    
        IL_0000: nop
    
        // new by calling the constructor and push its reference
        IL_0001: newobj instance void ThatLong::.ctor() 
    
        // pop it to the first local variable, that is, i
        IL_0006: stloc.0
    
        IL_0007: ret
    } 
    

The behaviour of the methods is described in the comments in the IL code. You might also want to take a look at OpCodes.Initobj and OpCodes.Newobj. A value type is usually initialized with OpCodes.Initobj, but as MSDN says, OpCodes.Newobj can also be used.

  • description in OpCodes.Newobj

    Value types are not usually created using newobj. They are usually allocated either as arguments or local variables, using newarr (for zero-based, one-dimensional arrays), or as fields of objects. Once allocated, they are initialized using Initobj. However, the newobj instruction can be used to create a new instance of a value type on the stack, that can then be passed as an argument, stored in a local, and so on.

Each numeric value type, from byte to double, has dedicated IL opcodes. Although these types are declared as structs, the generated IL differs, as shown above.

Here are two more things to mention:

  1. ValueType itself is declared as an abstract class

    That is, you cannot new it directly.

  2. structs cannot contain explicit parameterless constructors (this held until C# 10, which lifted the restriction)

    That is, when you new a struct, you would fall into the case above of either NewMyLong or NewMyLongWithValue.
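The two cases can be sketched in C# as follows. `Counter` is a hypothetical struct, and the sketch assumes it declares no explicit parameterless constructor, so `new Counter()` should be the same as `default(Counter)`.

```csharp
using System;

struct Counter
{
    public int Ticks;
    public string Label;

    // Constructor with a parameter: must assign every field.
    public Counter(int ticks)
    {
        Ticks = ticks;
        Label = "set by constructor";
    }
}

static class DefaultVsNewDemo
{
    // Corresponds to the NewMyLong case: no constructor body runs, fields are zeroed.
    public static bool ParameterlessNewEqualsDefault()
    {
        Counter a = new Counter();
        Counter b = default(Counter);
        return a.Equals(b) && a.Ticks == 0 && a.Label == null;   // true
    }

    // Corresponds to the NewMyLongWithValue case: the declared constructor runs.
    public static int ConstructorWithArgument()
    {
        Counter c = new Counter(1234);
        return c.Ticks;   // 1234
    }
}
```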

To summarize, new exists for value types and structures for the sake of consistency in the language.

Ken Kin