If classes hold an object reference as their value, why doesn't the "new" keyword overwrite them?

Question

I'm struggling to understand all of the implications of pass-by-reference vs pass-by-value.

I understand that in C#, unless explicitly stated, you always pass variables by value. However, since non-primitive types hold references as their values, you're technically passing those references around. So if I had a class Book with a Name property, I could do something like

Book book1 = new Book("Fight club");
ChangeBookName(book1, "The Wolfman");

void ChangeBookName(Book book, string name){
    book.Name = name;
}

Then doing a Console.WriteLine(book1.Name) would output "The Wolfman", because even though there was a pass by value, the value was a reference to the object's location in memory, so changing the object through it also changed the original object.

However, if I do something like

Book book1 = new Book("Fight club");
ChangeBookName(book1, "The Wolfman");

void ChangeBookName(Book book, string name){
    book = new Book(name);
}

Then book1.Name won't actually be "The Wolfman", it'll still be "Fight club".

What is happening here behind the scenes? Is the new keyword creating a new object reference? But then, what is happening to the original value that was passed? Why is the new instance of Book not overwriting the old one?

[This question](https://stackoverflow.com/questions/9224517/what-are-classes-references-and-objects) is about Java, but the main points mentioned in it can be applied in C# too. That's just how similar the two languages are. — Sweeper, Aug 30 '20 at 04:56

score 2 · Answer 1 · answered Aug 30 '20 at 04:57

Book book1 = new Book("Fight club");          // <-- book1 holds a ref to 'Fight Club'
ChangeBookName(book1, "The Wolfman");         // <-- a copy of that ref is passed as an argument
// ...                                        // <-- book1 still holds the original ref to 'Fight Club'
    
void ChangeBookName(Book book, string name){  // <-- receives the copy of the ref to 'Fight Club'
    book = new Book(name);                    // <-- overwrites it with a ref to 'The Wolfman'
}                                             // <-- lifetime of the temp copy ends here
                                              // <-- 'The Wolfman' object becomes eligible for gc

Caius Jard · Accepted Answer · 2020-08-30T07:53:38.587

It's probably easier to understand if you forget all that "pass by reference" vs "pass by value". They're poor turns of phrase, as you're finding out, because they tend to make you think that the Book data in memory is either copied when it is passed to the method, or that the original is passed. "Pass by copy of reference" and "pass by original reference" might be better: a class instance is never copied when it is passed, only a reference to it is.

I find it more helpful to think of nearly every variable in a program as being a reference in its own right, and "pass by value/reference" refers to whether a new reference is created or not when calling a method.

So you have your line:

Book book1 = new Book("Fight club");

Immediately after we execute this line, there is one variable name in your program, book1, and it refers to some block of data at memory address 0x1234 that contains "Fight club".

ChangeBookName(book1, "The Wolfman");

We call the ChangeBookName method and C# establishes another reference, called book, because that is what it says in the method signature, also pointing to address 0x1234.

You have two references, one block of data. The book reference will be lost when the method ends; its lifetime is only between the { } of the method.

If you use this additional book reference to change something about the data:

book.Name = "The wolfman";

Then the first reference, book1, will see the change: it points to the same data, and the data changes.

If you point this additional book reference to a whole new block of data elsewhere in memory:

book = new Book("The wolfman");

You now have two references, two blocks of data: book1 points to "fight club" at 0x1234, and book points to "the wolfman" at 0x2345. The wolfman data and the book reference will be lost when the method ends.

The crucial point here about having two references to one block of data is that you can change some property of the data and both references see it. But if you point one of the references to a new block of data, the original reference remains pointing to the original data.
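A minimal sketch of both cases side by side, in case it helps to run them. The shape of Book (a Name property set in the constructor) is an assumption based on the question, and ReferenceDemo, Rename, Replace and Run are names made up for this illustration.

```csharp
using System;

// Assumed shape of the question's Book class.
public class Book
{
    public string Name { get; set; }
    public Book(string name) { Name = name; }
}

// Hypothetical demo class, not part of the original code.
public static class ReferenceDemo
{
    // Changes the one object that both the caller's variable and 'book' refer to.
    public static void Rename(Book book, string name)
    {
        book.Name = name;
    }

    // Repoints only the method's own copy of the reference.
    public static void Replace(Book book, string name)
    {
        book = new Book(name);
    }

    public static void Run()
    {
        Book book1 = new Book("Fight club");
        Book original = book1; // a second reference to the same block of data

        Rename(book1, "The Wolfman");
        Console.WriteLine(book1.Name); // The Wolfman

        Replace(book1, "Needful things");
        Console.WriteLine(book1.Name); // The Wolfman
        Console.WriteLine(object.ReferenceEquals(book1, original)); // True
    }
}
```

Calling ReferenceDemo.Run() shows that the rename is visible through book1, while the replacement never reaches it: book1 still refers to the very same object it started with.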

If you want a method to be able to swap out the block of data for a whole new block of data and also have the original reference experience the change, you use the ref keyword. Conceptually this causes C# not to make a copy of the reference at all, but to reuse the same reference (albeit with a different name).

void ChangeBookForANewOne(ref Book tochange){
  tochange = new Book("Needful things");
}

Book b = new Book("Fight club");
ChangeBookForANewOne(ref b);

All through this code there is only one reference to one block of data. Changing the block of data for a new one inside the method causes the change to be remembered when the method exits.

We seldom do ref; if you want to change your book for a new one you should really return it from the method and change reference b to be the newly returned book. People use ref when they want to return more than one thing from a method, but really that's an indicator that you should be using a different class as the return type.
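As a rough sketch of the "return it instead" approach, assuming the same Book shape as in the question (a Name property set in the constructor). ReturnDemo and CreateReplacement are hypothetical names used only for this example.

```csharp
using System;

// Assumed shape of the question's Book class.
public class Book
{
    public string Name { get; set; }
    public Book(string name) { Name = name; }
}

// Hypothetical demo class, not part of the original code.
public static class ReturnDemo
{
    // Builds the replacement and hands it back; no ref parameter is involved.
    public static Book CreateReplacement(string name)
    {
        return new Book(name);
    }

    public static void Run()
    {
        Book b = new Book("Fight club");
        b = CreateReplacement("Needful things"); // the caller decides to repoint b
        Console.WriteLine(b.Name); // Needful things
    }
}
```

The assignment to b happens at the call site, so anyone reading the calling code can see that b is being pointed at a new object.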

The same notions are true for value types (usually primitive things like int), but the slight difference is that if you pass an int to a method you end up with two variable names and also two ints in memory. If you increment the int inside the method, the original int doesn't change, because the additional variable established for the lifetime of the method call is different data in memory: the data really is copied. Ref-style behaviour is more useful and more common for things like this, for example int.TryParse. It returns true or false to indicate whether parsing succeeded, but in order to return the parsed value to you it needs to use the original variable you passed in, not a copy of it.
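A small sketch of the value-type case, for anyone who wants to try it. ValueTypeDemo, Increment and IncrementByRef are made-up names for this illustration; int.TryParse is the real framework method mentioned above.

```csharp
using System;

// Hypothetical demo class, not part of the original code.
public static class ValueTypeDemo
{
    // 'number' is a separate int in memory; incrementing it leaves the caller's int alone.
    public static void Increment(int number)
    {
        number++;
    }

    // With ref, 'number' is the caller's variable under another name.
    public static void IncrementByRef(ref int number)
    {
        number++;
    }

    public static void Run()
    {
        int count = 5;

        Increment(count);
        Console.WriteLine(count); // 5

        IncrementByRef(ref count);
        Console.WriteLine(count); // 6

        // TryParse reports success through its return value and
        // delivers the parsed number through the out argument.
        if (int.TryParse("42", out int parsed))
        {
            Console.WriteLine(parsed); // 42
        }
    }
}
```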

To do this, TryParse uses a variation of ref called out, a marker on a method parameter that indicates "this method will definitely assign a value to the variable you pass in; if you were to give it a variable already initialized to a value it would definitely be overwritten". In contrast, ref indicates "you can pass a variable in that is initialized to a value, I might use the value and I might overwrite it/point it to new data in memory". If you have a method that doesn't need to take a value but definitely overwrites, like my ChangeBookForANewOne before, you should really use out: in 100% of cases ChangeBookForANewOne overwrites what was passed in, which could cause unintended data loss for a developer. Marking it as out means the caller has to write out at the call site and can pass a variable that has not been assigned yet, while the compiler stops the method from reading the incoming value and forces it to assign one, which makes the overwrite explicit and helps prevent unintended data loss.
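Here is how the earlier method might look with out instead of ref. This is only a sketch: the Book shape is assumed from the question, and OutDemo and Run are hypothetical names.

```csharp
using System;

// Assumed shape of the question's Book class.
public class Book
{
    public string Name { get; set; }
    public Book(string name) { Name = name; }
}

// Hypothetical demo class, not part of the original code.
public static class OutDemo
{
    public static void ChangeBookForANewOne(out Book toChange)
    {
        // The compiler treats toChange as unassigned here, so the method
        // cannot read it and must assign it before returning.
        toChange = new Book("Needful things");
    }

    public static void Run()
    {
        // The variable can be declared right at the call site.
        ChangeBookForANewOne(out Book fresh);
        Console.WriteLine(fresh.Name); // Needful things
    }
}
```

The out keyword is required at the call site as well, so the calling code documents that the variable is about to be assigned by the method.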

Good answer! If you have never worked with pointers and (de)referencing it can be really confusing. I agree that a better solution would be to return the things that are needed instead of doing things by ref. But if you are giving back some simple stuff I would choose a tuple instead of a whole class (e.g. a string and an int). — sunriax, Aug 30 '20 at 07:36
Yes, I was going to add a footnote about tuples; they are in essence "temporary classes the compiler helps you create" so that golden rule "upgrade the class you return rather than using ref" does still work, it's just harder to see through the syntactic sugar that makes using tuples nice :) — Caius Jard, Aug 30 '20 at 07:45

score 0 · Answer 3 · answered Aug 30 '20 at 05:01

So, as you said, the reference is passed by value and "new" returns a reference to a new object, but that only overwrites the local copy of the old reference. To avoid this, you must pass the book with the "ref" keyword, and then it behaves like a "reference to the reference of the book".

Batangaming

sunriax · Answer 4 · 2020-08-30T05:57:16.953

If you want the original book to be replaced with a newly created one, you need the ref or out keyword:

void ChangeBookName(ref Book book, string name)
{
    book = new Book(name);
}

// ...
Book book1 = new Book("Fight club");
ChangeBookName(ref book1, "The Wolfman");

This means that the reference to the original book is replaced by a reference to the newly created book, and the original book becomes eligible for garbage collection once nothing else refers to it.

Maybe this helps...
