
It seems any C# question must be a duplicate, but I could not find one.

Coding in C#:

MyClassType mt = null;
dofunction(mt);
// mt was modified in dofunction to some non null value, but comes back still null

dofunction is something like

public void dofunction(MyClassType mt){
    // assign some non-null value to the parameter
    mt = new MyClassType { SomeProperty = "xxxxx" };
}

To get the update to be seen in the caller, I have to use the ref keyword. Why? I thought class instances were always passed as ref, without need of the ref keyword. MyClassType tells dofunction that mt is a class instance, but it doesn't act like it. Setting mt=null is not a class instance, I guess, but so what?

Again, the question is "Why?".

I've edited ..........

Rephrased question, which seems different from the so-called duplicates: If you set mt to null before the call, it acts like you have NOT used the ref keyword. If you set mt to new MyClassType() before the call, it acts as if you have used the ref keyword. Why would C# act in this convoluted way?

Here is example code that demonstrates my premise. I am writing C# in Visual Studio 2019, ASP.Net Core 5.

Here is the calling routine:

public IActionResult RefTest() {
    MrShow m1 = null; // in this case doFunction acts as if m1 does NOT have the ref keyword
    MyProject.Utils.TestIt.doFunction(m1);
    string m1contents = "result when m1 set to null: ";
    if (m1 == null) m1contents += "null / ";
    else m1contents += m1.idnmbr + " / ";
    m1contents += "result when m1 set to class instance: ";
    m1 = new MrShow(); // in this case doFunction acts as if m1 does have the ref keyword
    MyProject.Utils.TestIt.doFunction(m1);
    if (m1 == null) m1contents += "null / ";
    else m1contents += m1.idnmbr + " / ";
    ViewBag.m1contents = m1contents;
    return View();
}

and here is what is called:

static internal void doFunction(MrShow m1) {
    if (m1 != null) {
        m1.idnmbr = "doFunction changed this";
    }
    else {
        m1 = new MrShow();
        m1.idnmbr = "m1 did not change this";
    }
}

In my web app, I get this result:

RefTest Results
result when m1 set to null: null / result when m1 set to class instance: doFunction changed this / 
Bob Koury
  • In C#, arguments can be passed to parameters either by value or by reference. The argument is passed by value even if the parameter is of a reference type. – user3026017 Aug 02 '21 at 14:38
  • Does this answer your question? [What is the use of “ref” for reference-type variables in C#?](https://stackoverflow.com/q/961717/11683) – GSerg Aug 02 '21 at 14:39
  • @GSerg - Yes, it is a duplicate question. But I like [the answer "walking with the dog"](https://stackoverflow.com/a/68623565/1016343) **Caius Jard** has given. ;-) – Matt Aug 02 '21 at 15:05
  • *If you set mt to null before the call, it acts like you have NOT used the ref keyword* - it acts like you have not used the `ref` keyword because you have not used the ref keyword. *If you set mt to new MyClassType() before the call, it acts as if you have used the ref keyword* - no, it doesn't – Caius Jard Aug 02 '21 at 19:04
  • Grrrrrrrrrrrr. "no, it doesn't" ???????? – Bob Koury Aug 02 '21 at 19:29
  • Honestly, there is no difference in behavior regardless of whether the argument is null or pre-initialized to a value; without `ref` you cannot set the passed-in variable to a `new` object and make it stick. With or without `ref` you can modify the instance data, unless the argument is null, in which case you cannot modify the instance because there is no instance, not because it was/wasn't passed with/without ref. "So long as there is a dog, you can always shave the dog, but you can only swap out the dog if it's a `ref` argument" (or an `out` argument) – Caius Jard Aug 02 '21 at 19:42
  • **You're shaving the dog!** And to make things worse, you're swapping the dog out, shaving the swapped out dog, throwing the swapped shaved dog away and then observing that the original dog is unaltered and trying to tie it to `ref` or null vs notnull - none of it is to do with ref and the nullness is driving the behavior of the dog shaving routine thanks to the ifnull check. You can shave the dog no matter how you pass it – Caius Jard Aug 02 '21 at 20:11
  • You keep asserting that this is different than the duplicate, despite the fact that the duplicate explains exactly the situation you're talking about. There is a difference between changing a variable and changing an object that a variable references. The duplicate explains these differences in depth. If you find one of the answers isn't helpful, consider reading more, as there are numerous answers addressing the problem from different perspectives. – Servy Aug 02 '21 at 20:25
  • Does this answer your question? GSerg -- yes, this is what I'm seeing. I need ref to permit caller to see update when caller passes null. – Bob Koury Aug 03 '21 at 13:53
  • It's nothing to do with passing null though; that's your code that is seeing null and setting it `new`. Yes, you need to use `ref` to make the `new` stick, but it's nothing to do with it being null. **Don't use ref** - it's nearly never used and for good reason; it makes code much harder to follow. `doFunction` should return the new instance it makes. Actually doFunction should probably decide what its role is in life - to make a new thing or modify an existing thing, and it should just do one of them. Perhaps one day, in a few years, you'll look back on this spaghetti code.. – Caius Jard Aug 03 '21 at 15:22
  • ..and better realize what we're saying; ref probably seems like a great way to code now because you just haven't experienced enough awful code to want to avoid it – Caius Jard Aug 03 '21 at 15:22
  • I have many object parameters, unlike my simple example, so it's tedious to have function return them all. – Bob Koury Aug 03 '21 at 19:23
  • Let me see if you'll agree with this statement: When caller sets input parameter to null, without ref keyword, callee gets a copy of the input parameter. So, when callee updates parm to something that doesn't point to null, callee on return knows nothing about it because the copy pointer was updated, not the one the caller has. – Bob Koury Aug 03 '21 at 19:27
  • Got time to more carefully read something you wrote: There are even built in ways to quickly return a pair or more of things without having to make classes just for them: .... That's useful. Thanks. – Bob Koury Aug 03 '21 at 19:41
  • regarding "First, i want to point out that the method actually has logic that makes it behave differently depending on whether the argument is null or not": in case parm was set to null by caller, my example sets passed parm to something in callee but caller doesn't see update. My example changed passed parm whether caller had previously set it to null or not. In the caller set it to null case, caller does not see updates to it by callee. – Bob Koury Aug 03 '21 at 19:47
  • *My example changed passed parm whether caller had previously set it to null or not* - no. You only change the passed in thing if the passed in thing is not null. In the case that you pass a null you first make a new thing, then you change the new thing. The new thing is not the passed-in thing. Then (without ref) you implicitly throw it away when the method ends. See the pictures I drew. First there is one variable referring to the null, then there are two, then the second variable refers to something else, then the something else is changed, then it is thrown away – Caius Jard Aug 03 '21 at 20:08
  • Caller does not see updates by callee, because the thing the caller is holding onto is not the thing that ends up getting changed. **Do not use ref**; either have your caller make a `new MrShow`, that `doFunction` can alter the contents of, or have `doFunction` make and alter a `new MrShow`, that it then `return`s and Caller can capture the `return`ed value. If you have a lot to return, shove it all in parentheses: `return (my1, my2, my3, my4)`, then you can deconstruct it on the caller: `(My1 x1, My2 x2, My3 x3, My4 x4) = dofunction();` rather than `dofunction(ref x1, ref x2, ref x3, ref x4)` – Caius Jard Aug 03 '21 at 20:12
  • Thanks again. Helpful. I think we're done. – Bob Koury Aug 03 '21 at 21:08

1 Answer


This is an often-discussed and quite confusing aspect of C#. I think the chief confusion arises because we talk about "pass by reference" and "pass by value", and values are copies, and those terms lead people to think that in some cases an object's data is copied and in other cases the original data is passed.

Reference types (classes) are always "passed by a reference" and when I say this I mean it's a contraction of "passed to a method by sending a reference to the data in as the method argument", but the crucial thing for what the method can do is essentially whether they are "passed by giving the original reference" or "passed by giving a copy of the reference"

The default is "copy"; you make a new class instance and its data goes somewhere in memory. As part of making it you created a variable to refer to it. Then you passed it to a method, and by default another independent variable is created that refers to the same data in memory. As such, you can change anything you like about the data, but you could only make the earlier created variable refer to an entirely different object if that earlier variable were itself passed. Because by default another variable is created, attached and passed, making that new variable refer to something else does not affect the earlier variable. In either "copy" mode or "original" mode you can modify some property of the object.

When the C# world says "passed by reference" (original) or "passed by value" (copy) they are talking about what happens to the variable that refers to the data that makes up the object; the variable is either sent original or a copy/additional is sent. They aren't talking about the object's data - there's only one blob of data with a reference type, with N number of variable references to it
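To make the "one object, several variables" point concrete, here is a minimal sketch. The `Dog` class and the `Echo` and `Shave` methods are hypothetical names invented for illustration; they are not from the question.

```csharp
using System;

// Hypothetical type for illustration only.
public class Dog
{
    public bool Shaved { get; set; }
}

public static class LeadDemo
{
    // Hands back the reference it was given, so the caller can compare it.
    public static Dog Echo(Dog d)
    {
        return d;
    }

    // Modifies the object that d refers to; d itself is a second variable.
    public static void Shave(Dog d)
    {
        d.Shaved = true;
    }

    public static void Main()
    {
        var mine = new Dog();

        // The method received a reference to the very same object, not a clone.
        Console.WriteLine(ReferenceEquals(Echo(mine), mine)); // True

        // No ref keyword, yet the caller sees the modification.
        Shave(mine);
        Console.WriteLine(mine.Shaved); // True
    }
}
```

Neither call uses `ref`, and in both cases the method works on the caller's object.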

I tend to explain it as taking your dog for a walk; there is one dog, just as there is one object in memory. When you call a method it's like letting your friend come along to walk the dog too and when they say "hey, can I lead him for a while?" you choose whether to give that person the original lead you're holding (ref keyword) or alternatively attaching a brand new lead to the same dog (so it has two leads) and giving the new lead to the other person (no keyword). The dog isn't cloned; there is only ever one dog. The lead is a reference to the dog; you hold the lead, not the dog. You steer the dog via the lead. Leads are always attached to the dog, not another lead. There isn't a chain

If that person takes their lead and attaches it to a whole new dog they found roaming around in the park, your lead is still attached to your dog. Their actions don't change which dog your lead is connected to. If you handed over your original lead and they attached it to a new dog, your dog is lost and when control comes back to you, you find out you have a poodle, not an Alsatian.

If there was no newing involved it wouldn't matter whether your friend used their new lead or your original lead to walk the dog to the grooming parlor and have it shaved; in either case they have modified some aspect of your dog and you see it as shaved when you get it back

ref thus purely dictates whether a method can replace a passed-in object with a new one and the calling method will see the change.
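A small sketch of that difference, assuming a hypothetical `Pet` class and two hypothetical methods that differ only in the `ref` keyword:

```csharp
using System;

// Hypothetical type for illustration only.
public class Pet
{
    public string Breed { get; set; } = "";
}

public static class SwapDemo
{
    // p is a copy of the caller's reference; reassigning it affects only the copy.
    public static void SwapOnCopy(Pet p)
    {
        p = new Pet { Breed = "Poodle" };
    }

    // p is an alias for the caller's variable; reassigning it changes the caller's variable.
    public static void SwapOnOriginal(ref Pet p)
    {
        p = new Pet { Breed = "Poodle" };
    }

    public static void Main()
    {
        var mine = new Pet { Breed = "Alsatian" };

        SwapOnCopy(mine);
        Console.WriteLine(mine.Breed); // Alsatian

        SwapOnOriginal(ref mine);
        Console.WriteLine(mine.Breed); // Poodle
    }
}
```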

Try not to use it; if your method is intended to make new objects it should return them rather than surprising the caller with "hey, I swapped your object for a new one"

Your dofunction should be like:

public MyClassType dofunction(){
  return new MyClassType() { SomeProperty = "xxxxx"};
}

We don't code like "here, have this null thing and set it to an instance of something for me" - we code like "give me a new thing and I will update my null thing if I want to"

MyClassType mt = dofunction();

If you get to a situation where "I have to use ref because I want to return two things" you can still avoid using it - make another class to hold the two things you want to return and return an instance of that. There are even built-in ways to quickly return a pair or more of things without having to make classes just for them:

public (MyClassType X, string Y) dofunction(){
  return (new MyClassType() { SomeProperty = "xxxxx"}, "hello this is Y we love c#");
}

The compiler effectively writes the holder type for you; it uses a ValueTuple behind the scenes:

var result = dofunction();
MyClassType mt = result.X;
string secondThing = result.Y;
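The tuple can also be deconstructed straight into separate variables on the calling side. The following is a self-contained sketch; the `MyClassType` definition here is a stand-in with an assumed `SomeProperty` string property, and `TupleDemo` is a hypothetical wrapper class.

```csharp
using System;

// Stand-in for the question's class; the property is an assumption.
public class MyClassType
{
    public string SomeProperty { get; set; } = "";
}

public static class TupleDemo
{
    // Returns two things at once as a tuple with named elements.
    public static (MyClassType X, string Y) dofunction()
    {
        return (new MyClassType { SomeProperty = "xxxxx" }, "hello");
    }

    public static void Main()
    {
        // Deconstruct the returned tuple into two local variables.
        (MyClassType mt, string secondThing) = dofunction();

        Console.WriteLine(mt.SomeProperty); // xxxxx
        Console.WriteLine(secondThing);     // hello
    }
}
```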

Edit: ok, so you've posted the experiment that led you to conclude that "there is a difference in behavior depending on if the argument passed is null or not"

First, I want to point out that the method actually has logic that makes it behave differently depending on whether the argument is null or not.

This means you're writing code that behaves differently, and observing a result and going "oh! C# is behaving like ref sometimes and not others!"

No, it isn't; C# is being consistent. You're being inconsistent.

And you're shaving the dog. I'll illustrate with pictures. I renamed the argument to the function to m1df to help tell things apart:

    //your method
    static internal void doFunction(MrShow m1df) {
        if (m1df != null) {
            m1df.idnmbr = "doFunction changed this";
        }
        else {
            m1df = new MrShow();
            m1df.idnmbr = "m1 did not change this";
        }
    }

    //your code
    MrShow m1 = null; // in this case doFunction acts as if m1 does NOT have the ref keyword
    MyProject.Utils.TestIt.doFunction(m1);

[Image: diagram of the call with a null argument]

Let's go again, this time with a non null argument:

[Image: diagram of the call with a non-null argument]

None of this is anything to do with ref; you don't need ref to make edits to a passed object (setting idnmbr to a string) survive after the method is over. You need ref to make wholesale replacements of the entire object (use of new keyword to instantiate a new instance) survive
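The null case can be reduced to the sketch below. The shape of `MrShow` is assumed from the question, and the method names `NoRef`, `WithRef` and `Returning` are hypothetical; each one creates an instance when it is handed null, and they differ only in how (or whether) the new instance gets back to the caller.

```csharp
using System;

// Shape assumed from the question; only idnmbr is used here.
public class MrShow
{
    public string idnmbr = "";
}

public static class NullCaseDemo
{
    // The new instance is attached to the local copy and discarded on return.
    public static void NoRef(MrShow m)
    {
        if (m == null) { m = new MrShow(); }
        m.idnmbr = "set by callee";
    }

    // The new instance is attached to the caller's own variable.
    public static void WithRef(ref MrShow m)
    {
        if (m == null) { m = new MrShow(); }
        m.idnmbr = "set by callee";
    }

    // The new instance is handed back explicitly; no ref needed.
    public static MrShow Returning(MrShow m)
    {
        if (m == null) { m = new MrShow(); }
        m.idnmbr = "set by callee";
        return m;
    }

    public static void Main()
    {
        MrShow a = null;
        NoRef(a);
        Console.WriteLine(a == null); // True

        MrShow b = null;
        WithRef(ref b);
        Console.WriteLine(b.idnmbr); // set by callee

        MrShow c = null;
        c = Returning(c);
        Console.WriteLine(c.idnmbr); // set by callee
    }
}
```

The `Returning` form keeps the replacement visible at the call site, which is the style recommended earlier in this answer.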

=> You can always shave the dog, because what is passed is always a reference (to the single pile of data that makes up the instance). If passing caused a copy of the entire dog to be created, you could never shave the original dog. What is passed is always a reference, and that reference is duplicated unless ref is specified.

Caius Jard
  • It's not true: They are always "passed by reference", – Bob Koury Aug 02 '21 at 17:56
  • It's not true: They are always "passed by reference", -- if you set m1 to null, it is not passed by reference; if you set it to "new MyClassType()", it is passed by reference. My main question is why should it matter what I set it to before I make the call to the method? The way it works makes life more convoluted. Regarding having the method return and MyClassType, I often pass many parameters; when multiple are changed, it's a chore to pass back everything in some new class which conglomerates other classes. – Bob Koury Aug 02 '21 at 18:02
  • *why should it matter what I set it to before I make the call to the method* - it doesn't matter. Whatever experiment you've carried out to come to this conclusion is broken, and the premise is faulty. https://dotnetfiddle.net/xW9C2U – Caius Jard Aug 02 '21 at 19:03
  • Now we are at the crux. I have written code to demo, but I'm not sure how replicate it here. I'm not familiar with your dotnetfiddle ---- let see what's that's about. – Bob Koury Aug 02 '21 at 19:34
  • You can write it into the fiddle (or over the top of it), and hit save or share, and you'll get a link to paste back here – Caius Jard Aug 02 '21 at 19:38
  • Your example seems to say it never acts like it uses the ref keyword. My example says it does sometimes. Is there something different about StringBuilder than my own class? I don't know. Anyway. let me paste something here that will probably format into mess: ------- it's too long to paste – Bob Koury Aug 02 '21 at 19:47
  • Perhaps unsurprisingly, my example never acts like it uses the `ref` keyword **because it doesn't use the ref keyword**. Paste the link to the fiddle; no one has any desire to read code longer than one line in comments. StringBuilder is a convenient and well encountered reference type that is easily modifiable, I'll do another class if you want – Caius Jard Aug 02 '21 at 19:55
  • Mary never survives: https://dotnetfiddle.net/0LTjkX – Caius Jard Aug 02 '21 at 20:00
  • I could not paste here in sequence, so I edited the original question with my pasted example......... if you have time to take a look. Because I'm writing in a web app, I cannot paste the whole project, only what I see as relevant. – Bob Koury Aug 02 '21 at 20:00
  • Sarah always survives https://dotnetfiddle.net/EmY3yO – Caius Jard Aug 02 '21 at 20:04
  • Crucial point: **I never shave the dog** - `ref` is nothing to do with shaving the dog – Caius Jard Aug 02 '21 at 20:04
  • I've updated the answer with regards to your question edits – Caius Jard Aug 02 '21 at 21:14
  • "because passing is always by reference" This is just objectively false. And saying these factually false statements is exactly why people like the OP get confused when the parameter that you claim is passed by reference does not behave as if it's passed by reference. The parameter is only passed by reference when you use a keyword that results in it being passed by reference. Passing a reference type by value and passing a reference type by reference are very different things, and it's important to distinguish between them correctly. – Servy Aug 02 '21 at 21:36
  • Yeah, I think you're making it worse tbh ("pass a reference type by value" and "pass a reference type by reference" are terrible phrases that don't adequately explain how it works and only make sense when you know how it works), but I've updated the answer in response to your comment – Caius Jard Aug 03 '21 at 06:19
  • If you think that the terms are confusing and unhelpful, then why did you *use the terms, but simply use them incorrectly*? If you think the terms aren't helpful, then you could not use them at all, and explain the concepts without using the established industry terms for these concepts, but using the terms *incorrectly* is not going to make things better than using the terms properly. Using terms to mean the opposite of what they actually mean doesn't make things *less* confusing. – Servy Aug 03 '21 at 12:09
  • Thanks very much for explanations. Seems like I need to use ref to be sure caller sees updates by callee. – Bob Koury Aug 03 '21 at 13:56
  • "Updates" is a very vague term. You only need to use `ref` if you want the caller to see **replacement instances** (i.e. `person = new Person()`) by the callee. You do not need `ref` to see **instance modifications** (i.e. `person.Name = "new name"`). I've said it before, in the answer, but I'll say it again **Don't use `ref`** - return a new object instead. "But I have multiple things to return" is not a strong enough reason, especially when an anonymous type or `ValueTuple` written by the compiler takes all the pain out of "returning multiple things" – Caius Jard Aug 03 '21 at 15:12
  • @BobKoury An object is a container for more objects and/or primitive values. If you're changing/replacing the values inside the container, you don't need `ref`. You should only use `ref` when you want to replace the container itself (which is quite an unusual requirement, so use with great care) or when you want to pass a primitive value by reference directly. – Clonkex May 01 '23 at 04:25