44

I am new to C#, from a C++ background. In C++ you can do this:

class MyClass{
....
};
int main()
{
   MyClass object; // this will create object in memory
   MyClass* object = new MyClass(); // this does same thing
}

Whereas, in C#:

class Program
{
    static void Main(string[] args)
    {
        Car x;
        x.i = 2;
        x.j = 3;
        Console.WriteLine(x.i);
        Console.ReadLine();

    }
}
class Car
{
    public int i;
    public int j;


}

you can't do this. I wonder why Car x won't do its work.

svick
user6528398
  • 3
    That's how C# is. You need `new` keyword to create a new instance of the class, so it would be `Car x=new Car();` – Pikoh Jun 29 '16 at 12:30
  • 7
    I've often said that it's easier to go from C++ to either C# or Java, than the other way around. With solid understanding of C++ fundamentals, picking up C# or Java is fairly straightforward. But both of these do so much stuff under the covers, and which is not typically explained very well in traditional resources that teach C# and Java, that when going from either of those two to C++ it's always a very, very frustrating experience. – Sam Varshavchik Jun 29 '16 at 12:35
  • 81
    Btw, `Myclass object;` does something different than `Myclass* object = new MyClass();` – MaciekGrynda Jun 29 '16 at 12:37
  • @MaciekGrynda what is the difference? – user6528398 Jun 29 '16 at 12:42
  • 1
    I had [a similar question about Java](http://stackoverflow.com/questions/6340535/is-the-new-keyword-in-java-redundant) a while back. In this case I really think it is cosmetic. In C#, there it does make a difference. – juanchopanza Jun 29 '16 at 12:49
  • 28
    @user6528398 The first one creates the object on the stack, and you don't have to free its memory (the destructor is called at the end of the scope); the second one creates the object on the heap, and you have to free the memory yourself. – MaciekGrynda Jun 29 '16 at 12:54
  • 1
    Because C# `MyClass o = null; o = new MyClass(5); int k = o.myField;` is 100% equivalent to `MyClass *o = null; o = new MyClass(5); int k = o->myField;` in C++: i.e. the philosophy of C# (or Java or even Python or JavaScript) is to make everything a pointer to an object, with the possible (language-dependent) exception of basic types such as `int`, `float`, `bool`, etc. Once this is said, all you need is to talk about the garbage collector, and voilà, you know C#, Java, etc. – reuns Jun 29 '16 at 19:24
  • 23
    @user6528398 If you think the two C++ lines are equivalent, your code will have a massive amount of memory leaks. The difference is so fundamental and important in C++ that I would tone down the _I have a background in C++_ part - you don't. – pipe Jun 29 '16 at 19:25
  • The simplest way to look at it, without delving into the nitty-gritty of it, is that in languages with automatic memory management, like C# or Java, most variables are secretly pointers to the actual instance, with the actual instance on the heap; this is done so that the memory management system can keep track of the variable, and handle it appropriately. In that case, it makes perfect sense from a C++ standpoint to use `new`, because for any type `T`, where `T` is non-primitive, `T tInstance` in C# or Java would translate to `T* tInstance` in C++. – Justin Time - Reinstate Monica Jun 29 '16 at 19:25
  • 4
    I also can't do `Car x; x.j = 3; Console.WriteLine(x.i);` in Haskell for some reason. – user253751 Jun 29 '16 at 22:03
  • I would think the easiest way to think of this is: You need a new car because the old car doesn't work. ;-P Really though, in its simplest terms, the new command gives you a new, separate instance of the class you are asking for. There are differences between C# and C++ but basically, in a general sense, this is what is happening. – Mark Manning Jun 30 '16 at 01:47

6 Answers

64

There are a lot of misconceptions here, both in the question itself and in the several answers.

Let me begin by examining the premise of the question. The question is "why do we need the new keyword in C#?" The motivation for the question is this fragment of C++:

 MyClass object; // this will create object in memory
 MyClass* object = new MyClass(); // this does same thing

I criticize this question on two grounds.

First, these do not do the same thing in C++, so the question is based on a faulty understanding of the C++ language. It is very important to understand the difference between these two things in C++, so if you do not understand very clearly what the difference is, find a mentor who can teach you how to know what the difference is, and when to use each.

Second, the question presupposes -- incorrectly -- that those two syntaxes do the same thing in C++, and then, oddly, asks "why do we need new in C#?" Surely the right question to ask given this -- again, false -- presupposition is "why do we need new in C++?" If those two syntaxes do the same thing -- which they do not -- then why have two syntaxes in the first place?

So the question is both based on a false premise, and the question about C# does not actually follow from the -- misunderstood -- design of C++.

This is a mess. Let's throw out this question and ask some better questions. And let's ask the question about C# qua C#, and not in the context of the design decisions of C++.

What does the `new X` operator do in C#, where `X` is a class or struct type? (Let's ignore delegates and arrays for the purposes of this discussion.)

The new operator:

  • Causes a new instance of the given type to be allocated; new instances have all their fields initialized to default values.
  • Causes a constructor of the given type to be executed.
  • Produces a reference to the allocated object, if the object is a reference type, or the value itself if the object is a value type.

All right, I can already hear the objections from C# programmers out there, so let's dismiss them.

Objection: no new storage is allocated if the type is a value type, I hear you say. Well, the C# specification disagrees with you. When you say

S s = new S(123);

for some struct type `S`, the spec says that new temporary storage is allocated on the short-term pool, initialized to its default values, the constructor runs with `this` set to refer to the temp storage, and then the resulting object is copied to `s`. However, the compiler is permitted to use a copy-elision optimization provided that it can prove that it is impossible for the optimization to become observed in a safe program. (Exercise: work out under what circumstances a copy elision cannot be performed; give an example of a program that would have different behaviours if elision was or was not used.)

Objection: a valid instance of a value type can be produced using default(S); no constructor is called, I hear you say. That's correct. I didn't say that new is the only way to create an instance of a value type.

In fact, for a value type new S() and default(S) are the same thing.

Objection: Is a constructor really executed for situations like new S(), if not present in the source code in C# 6, I hear you say. This is an "if a tree falls in the forest and no one hears it, does it make a sound?" question. Is there a difference between a call to a constructor that does nothing, and no call at all? This is not an interesting question. The compiler is free to elide calls that it knows do nothing.

Suppose we have a variable of value type. Must we initialize the variable with an instance produced by new?

No. Variables which are automatically initialized, such as fields and array elements, will be initialized to the default value -- that is, the value of the struct where all the fields are themselves their default values.

Formal parameters will be initialized with the argument, obviously.

Local variables of value type are required to be definitely assigned with something before the fields are read, but it need not be a new expression.

So effectively, variables of value type are automatically initialized with the equivalent of default(S), unless they are locals?

Yes.

Why not do the same for locals?

Use of an uninitialized local is strongly associated with buggy code. The C# language disallows this because doing so finds bugs.

Suppose we have a variable of reference type. Must we initialize the variable with an instance produced by new?

No. Automatic-initialization variables will be initialized with null. Locals can be initialized with any reference, including null, and must be definitely assigned before being read.

So effectively, variables of reference type are automatically initialized with null, unless they are locals?

Yes.

Why not do the same for locals?

Same reason. A likely bug.

Why not automatically initialize variables of reference type by calling the default constructor automatically? That is, why not make `R r;` the same as `R r = new R();`?

Well, first of all, many types do not have a default constructor, or for that matter, any accessible constructor at all. Second, it seems weird to have one rule for an uninitialized local or field, another rule for a formal, and yet another rule for an array element. Third, the existing rule is very simple: a variable must be initialized to a value; that value can be anything you like; why is the assumption that a new instance is desired warranted? It would be bizarre if this

R r;
if (x) r = M(); else r = N();

caused a constructor to run to initialize r.
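To see why that would be bizarre, it may help to look at C++, where declaring a variable of class type really does run a constructor. The following is only a minimal sketch, not anything taken from the C# specification; `R`, `M`, `N`, `default_constructions` and `declare_then_assign` are hypothetical names chosen to mirror the fragment above.

```cpp
#include <cassert>

// Hypothetical counter: how many times R's default constructor has run.
int default_constructions = 0;

// Hypothetical type standing in for R in the fragment above.
struct R {
    int value = 0;
    R() { ++default_constructions; }
    explicit R(int v) : value(v) {}
};

// Hypothetical stand-ins for M() and N().
R M() { return R(1); }
R N() { return R(2); }

// Declare first, assign in a branch. In C++ the declaration alone
// already constructs an object, which is then simply overwritten.
int declare_then_assign(bool x) {
    R r;                     // default constructor runs here
    assert(r.value == 0);    // a throwaway object already exists
    if (x) r = M(); else r = N();
    return r.value;
}
```

Under these assumptions, every call to `declare_then_assign` increments `default_constructions` once, even though the default-constructed object is never used. C# avoids that wasted work by treating the declaration as nothing more than a variable that must be assigned before it is read.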

Leaving aside the semantics of the new operator, why is it necessary syntactically to have such an operator?

It's not. There are any number of alternative syntaxes that could be grammatical. The most obvious would be to simply eliminate the new entirely. If we have a class C with a constructor C(int) then we could simply say C(123) instead of new C(123). Or we could use a syntax like C.construct(123) or some such thing. There are any number of ways to do this without the new operator.

So why have it?

First, C# was designed to be immediately familiar to users of C++, Java, JavaScript, and other languages that use new to indicate new storage is being initialized for an object.

Second, the right level of syntactic redundancy is highly desirable. Object creation is special; we wish to call out when it happens with its own operator.

Eric Lippert
  • 2
    "Object creation is special". I don't know, object creation doesn't feel so special nowadays. Great answer nonetheless. – Mephy Jun 30 '16 at 01:38
  • 2
    "In C# 6, you may define a default constructor that is called when `new S()` is invoked, but not when `default(S)` is used." - That feature was dropped in the end - because it caused too many headaches, from what I remember. (You can still do it in IL of course, and the situations in which it's invoked aren't always obvious...) – Jon Skeet Jun 30 '16 at 06:16
  • 2
    Another thing that might be useful to note is that due to C# naming conventions as they stand right now, if we didn't have `new` it would not be very easy to tell the difference between using a constructor and invoking a method that is within the same scope as the caller. Having `new` helps disambiguate that instead of necessitating the need to fully qualify the method call. – Dan Jun 30 '16 at 10:18
  • 3
    -1. @Eric Rather than criticize the OP's question and say "get a mentor" to say what the difference in the syntax is, why don't you explain it for the audience? You obviously seem to know. To me, you CAN reference a class as an object in both syntaxes, and that is what both of the statements in the OP's question do. Seemed a perfectly valid question to me, and your lengthy explanation did not address this point, to me. "WHY have 2 different syntaxes for something you can do in both languages?" is a legitimate question. – vapcguy Jul 01 '16 at 16:26
  • 1
    @JonSkeet: Thanks for the note; plainly I am very behind on my reading the notes. :) – Eric Lippert Jul 01 '16 at 16:29
  • 6
    @vapcguy: Because the question is about the design of the C# language, not a request for a tutorial to disabuse the original poster of their false beliefs about C++. I'm trying to answer the question that was asked here, which is about the design of the `new` operator and how it relates to variable declarations. A lengthy tutorial on how storage lifetime works in C++ would be off topic. – Eric Lippert Jul 01 '16 at 16:32
  • 1
    @EricLippert Actually, that seemed to be exactly where the answer should go, at least to me. You use `new` in order to set the memory space. But C++ does the same thing without it, doesn't it? So is there really a difference? And where you said "many types do not have a default constructor, or for that matter, any accessible constructor at all", it seems to bode more for not having the `new` syntax, than supporting it. I did like it that you said adding `new` makes it similar to other languages, like Java and JavaScript, that use `new` and want to emphasize it is setting "new" memory. – vapcguy Jul 05 '16 at 16:14
  • @vapcguy: Well then rather than leaving comments here, I encourage you to write your own answer that explains things in a way you like better. There is plenty of opportunity here for a diversity of answers. – Eric Lippert Jul 05 '16 at 17:29
  • @EricLippert Thanks for your response. Between your answer relating the `new` keyword to Java and JavaScript, Dan Pantry's response above and zacaj's answer about solving ambiguity problems, and Peter Respondek's post on how you can write code that will compile, but crash, I think my viewpoint got covered. – vapcguy Jul 05 '16 at 19:55
32

In C# you can do a similar thing:

  // please notice "struct"
  struct MyStruct {
    ....
  }

  MyStruct sample1; // this will create object on stack
  MyStruct sample2 = new MyStruct(); // this does the same thing

Recall that primitives like int, double, and bool are also of type struct, so even though it's conventional to write

  int i;

we may also write

  int i = new int(); 

Unlike C++, C# doesn't use pointers to instances (in safe mode); however, C# has both class and struct declarations:

  • class: you have reference to instance, memory is allocated on heap, new is mandatory; similar to MyClass* in C++

  • struct: you have value, memory is (usually) allocated on stack, new is optional; similar to MyClass in C++

In your particular case you can just turn Car into a struct

struct Car
{
    public int i;
    public int j;
}

and so the fragment

Car x; // since Car is struct, new is optional now 
x.i = 2;
x.j = 3;

will be correct

8protons
Dmitry Bychenko
  • 35
    I'd argue against the usual misconception that is continuously being thrown around and is creating an amazingly generalized false belief: classes go on the heap, structs go on the stack. That is absolutely false. The decision of what goes on the stack and what goes on the heap depends on the expected lifetime of the object, not its nature; a struct that can't be verified to be short-lived will go on the heap: `class myClass { int i = 1; //this will go on the heap, not the stack. }`. – InBetween Jun 29 '16 at 13:03
  • 5
    @InBetween: Thank you! I see https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/ – Dmitry Bychenko Jun 29 '16 at 13:07
  • Can you explain me what is the difference between storing on heap and stack? As far as I know stack http://www.cplusplus.com/reference/stack/stack/ is this. I don't know what it is in terms of memory – user6528398 Jun 29 '16 at 13:28
  • Do not mix *collection type* (array, list, *stack*, dictionary...) and *memory organization*; http://www.c-sharpcorner.com/article/C-Sharp-heaping-vs-stacking-in-net-part-i/ – Dmitry Bychenko Jun 29 '16 at 13:41
  • @DmitryBychenko Here it says that string is reference type just like class. How can we declare strings just by string s; ( and not string s = new string() ) – user6528398 Jun 29 '16 at 14:23
  • @user6528398: you can *declare* by just `MyClass sample;` but if you want to *create* a class instance you have to put `new MyClass()`. In case of `String` it should have been something like `String test = new String(new char[] {'h', 'e', 'l', 'l', 'o'});`. Hopefully, in case of `String`, we have a *syntax sugar* `String test = "hello";` – Dmitry Bychenko Jun 29 '16 at 14:29
  • But you can put string s; s="Hello"; or you must put string s="Hello"; – user6528398 Jun 29 '16 at 14:30
  • 1
    There is a difference between `int i;` inside a type and `int i;` inside a method. The former initializes the variable to the default value (0 for int), the latter leaves it uninitialized and thus unreadable. – svick Jun 29 '16 at 17:13
  • 10
    I really don't like this answer - it has too many shaky assumptions. I think the right answer should be "in C#, unless *you know what you are doing*, you shouldn't really care about memory management." – Leonardo Herrera Jun 29 '16 at 17:38
  • 3
    I would suggest that the closest C++ analog to `class foo {...}` would be `typedef class {...} *foo`; one needs `new` for the same reason as one would need it in C++ (with that substitution), though the garbage-collector means that the destruction of the last extant pointer to an object will free the storage occupied thereby without having to use `delete`. – supercat Jun 29 '16 at 21:19
  • @inBetween Then how about this: Class instances behave like heap-allocated objects and struct instances behave like stack-allocated objects. – user253751 Jun 30 '16 at 07:28
  • @immibis Class instances behave like *reference types* and struct instances behave like *value types*. I'm not sure why you insist on tying behavior to the underlying storage mechanism; that might make sense in C++, but in C#, 99.9% of the time it's not something you need to worry about. – InBetween Jun 30 '16 at 08:09
15

In C#, class type objects are always allocated on the heap, i.e. variables of such types are always references ("pointers"). Just declaring a variable of such a type does not cause the allocation of an object. Allocating a class object on the stack like it's common to do in C++ isn't (in general) an option in C#.

Local variables of any type that have not been assigned to are considered uninitialized, and they cannot be read until they have been assigned to. This is a design choice (another way would have been to assign default(T) to every variable at declaration time) which seems like a good idea because it should protect you from some programming errors.

It's similar to how in C++ it wouldn't make sense to say SomeClass *object; and never assign anything to it.

Because in C# all class type variables are pointers, allocating an empty object when the variable is declared would lead to inefficient code when you actually only want to assign a value to the variable later, for instance in situations like this:

// Needs to be declared here to be available outside of `try`
Foo f;

try { f = GetFoo(); }
catch (SomeException) { return null; }

f.Bar();

Or

Foo f;

if (bar)
    f = GetFoo();
else
    f = GetDifferentFoo();
Matti Virkkunen
  • 3
    It's a bit sloppy to say that "class types are references". Class types are class types. What is true is that a *variable declared with class type* represents a nullable reference to an object. – Kerrek SB Jun 29 '16 at 15:06
  • @KerrekSB: Better? I admit talking about the heap is probably considered an implementation detail, but that's how it works in practice. – Matti Virkkunen Jun 30 '16 at 06:50
  • The stack discussion here does not help at all - it is orthogonal to the question. Whether the object was on the stack or the heap is irrelevant as to why `new` is needed. – mjwills May 10 '21 at 05:09
14

Ignoring the stack vs heap side of things:

Because C# made the bad decision to copy C++, when it could have just made the syntax

Car car = Car()

(or something similar). Having 'new' is superfluous.

zacaj
  • 4
    +1 for the only answer so far to actually get it right. There is no fundamental need for the `new` keyword in C# the way there is in C++, and there are other CLR languages, in fact, that don't use it (or any language-specific equivalent) at all. – Mason Wheeler Jun 29 '16 at 19:28
  • 5
    Except that of course that now introduces conflicts if you have a method called `Car()` in the scope where you're trying to call the constructor. The `new` part makes it clear that you're trying to call a constructor. I can see an argument for `Car.new()` instead, but just `Car()` seems like a bad idea to me. – Jon Skeet Jun 30 '16 at 06:15
  • Yes, adding `new` allows for disambiguated syntax, between 1) constructors/methods/referencing a class and 2) creating an object. So while it's not necessarily "needed", it goes a long way to distinguishing what you are employing. – vapcguy Jul 05 '16 at 19:51
  • 1
    The code `Car car = Car()` actually works in C++ although it is different from what you get with using `new` in C++. – Marian Spanik Aug 18 '16 at 12:08
  • Or you could simply have something like Car car = Car.(); – pythonic Feb 17 '17 at 22:08
8

When you use reference types, the statement

Car c = new Car();

creates two entities: a reference named c on the stack, and the object of type Car itself on the heap.

If you just write

Car c;

then you create an uninitialized reference (provided that c is a local variable) that points to nowhere.

In fact, it is equivalent to C++ code that uses pointers instead of references.

For example

Car *c = new Car();

or just

Car *c;

The difference between C++ and C# is that C++ can create instances of classes on the stack, like

Car c;

In C# this means creating a reference of type Car that, as I said, points nowhere.
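The two C++ forms can be compared side by side. This is a rough sketch under my own assumptions: the `live_cars` counter and the function names `automatic_storage` and `dynamic_storage` are hypothetical and exist only to make the object lifetimes visible.

```cpp
#include <cassert>

// Hypothetical counter of Car objects that are currently alive.
int live_cars = 0;

struct Car {
    int i = 0;
    int j = 0;
    Car() { ++live_cars; }
    ~Car() { --live_cars; }
};

// Automatic storage: the object lives inside the variable itself and
// is destroyed when the enclosing scope ends.
int automatic_storage() {
    {
        Car c;                  // object created here, no `new` involved
        c.i = 2;
        assert(live_cars == 1);
    }                           // destructor runs here
    return live_cars;
}

// Dynamic storage: the pointer is only a handle; `new` creates the object.
int dynamic_storage() {
    Car* c = nullptr;           // refers to no object yet
    c = new Car();              // the object is created only here
    c->i = 2;
    int alive_before_delete = live_cars;
    delete c;                   // without this the object would leak
    return alive_before_delete - live_cars;
}
```

A C# variable of class type behaves like the `Car*` in `dynamic_storage`, except that the garbage collector takes over the role of `delete`.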

Vlad from Moscow
2

From the Microsoft programming guide:

At run time, when you declare a variable of a reference type, the variable contains the value null until you explicitly create an instance of the object by using the new operator, or assign it an object that has been created elsewhere by using new.

A class is a reference type. When an object of the class is created, the variable to which the object is assigned holds only a reference to that memory. When the object reference is assigned to a new variable, the new variable refers to the original object. Changes made through one variable are reflected in the other variable because they both refer to the same data.

A struct is a value type. When a struct is created, the variable to which the struct is assigned holds the struct's actual data. When the struct is assigned to a new variable, it is copied. The new variable and the original variable therefore contain two separate copies of the same data. Changes made to one copy do not affect the other copy.
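For someone coming from C++, the two paragraphs above map roughly onto copying a pointer versus copying an object. The sketch below is an analogy only, not how the CLR is implemented; the function names `copy_by_value` and `copy_by_pointer` are hypothetical.

```cpp
#include <cassert>

struct Car {
    int i = 0;
    int j = 0;
};

// Value semantics (comparable to a C# struct): assignment copies the data.
int copy_by_value() {
    Car a;
    a.i = 1;
    Car b = a;      // b is an independent copy of a
    b.i = 42;       // changes only the copy
    return a.i;     // a is unaffected
}

// Reference semantics (comparable to a C# class variable): assignment
// copies only the pointer, so both names refer to the same object.
int copy_by_pointer() {
    Car* a = new Car();
    a->i = 1;
    Car* b = a;     // b refers to the same Car as a
    assert(a == b); // same address, one object
    b->i = 42;      // the change is visible through a as well
    int seen_through_a = a->i;
    delete a;       // C# would leave this to the garbage collector
    return seen_through_a;
}
```

Here `copy_by_value` returns 1 and `copy_by_pointer` returns 42, which matches the "two separate copies" versus "same data" distinction in the quoted text.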

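I think in your C# example you're effectively trying to assign values through a null pointer. Translated to C++, this would look like:

Car* x = nullptr;
x->i = 2;
x->j = 3;

This would compile, but dereferencing a null pointer is undefined behavior and in practice usually crashes. (C# is stricter: the compiler rejects the original example because the local variable x is used before it has been assigned.)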
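The fix in the C++ translation is the same as in C#: make the variable refer to an actual object before touching its members. A minimal sketch, assuming C++14 or later for `std::make_unique`; the function name `assign_before_use` is hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Car {
    int i = 0;
    int j = 0;
};

// The pointer starts out referring to nothing; member access is only
// safe after an object has been created and assigned to it.
int assign_before_use() {
    std::unique_ptr<Car> x;         // holds nullptr, like a null reference
    assert(x == nullptr);
    x = std::make_unique<Car>();    // plays the role of `new Car()` in C#
    x->i = 2;
    x->j = 3;
    return x->i + x->j;
}
```

`std::unique_ptr` is used here only so the sketch frees the object automatically; a raw `Car*` with `new` and `delete` would illustrate the same point.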

Community