149

When I pass a string to a function, is a pointer to the string's contents passed, or is the entire string passed to the function on the stack like a struct would be?

Cole Tobin

3 Answers

339

A reference is passed; however, it's not technically passed by reference. This is a subtle, but very important distinction. Consider the following code:

void DoSomething(string strLocal)
{
    strLocal = "local";
}
void Main()
{
    string strMain = "main";
    DoSomething(strMain);
    Console.WriteLine(strMain); // What gets printed?
}

There are three things you need to know to understand what happens here:

  1. Strings are reference types in C#.
  2. They are also immutable, so any time you do something that looks like you're changing the string, you aren't. A completely new string gets created, the reference is pointed at it, and the old one is left for the garbage collector once nothing else refers to it.
  3. Even though strings are reference types, strMain isn't passed by reference. It's a reference type, but the reference itself is passed by value. Any time you pass a parameter without the ref keyword (not counting out parameters), you've passed something by value.

So that must mean you're...passing a reference by value. Since it's a reference type, only the reference was copied onto the stack. But what does that mean?

Passing reference types by value: You're already doing it

C# types are either reference types or value types. C# parameters are either passed by reference or passed by value. The terminology is a problem here; these sound like the same thing, but they're not.

If you pass a parameter of ANY type, and you don't use the ref keyword, then you've passed it by value. If you've passed it by value, what you really passed was a copy. But if the parameter was a reference type, then the thing you copied was the reference, not whatever it was pointing at.

Here's the first line of the Main method:

string strMain = "main";

We've created two things on this line: a string with the value main stored off in memory somewhere, and a reference variable called strMain pointing to it.

DoSomething(strMain);

Now we pass that reference to DoSomething. We've passed it by value, so that means we made a copy. It's a reference type, so that means we copied the reference, not the string itself. Now we have two references that each point to the same value in memory.

Inside the callee

Here's the top of the DoSomething method:

void DoSomething(string strLocal)

No ref keyword, so strLocal and strMain are two different references pointing at the same value. If we reassign strLocal...

strLocal = "local";   

...we haven't changed the stored value; we took the reference called strLocal and aimed it at a brand new string. What happens to strMain when we do that? Nothing. It's still pointing at the old string.

string strMain = "main";    // Store a string, create a reference to it
DoSomething(strMain);       // Reference gets copied, copy gets re-pointed
Console.WriteLine(strMain); // The original string is still "main" 
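
To watch the copy being made and then re-pointed, a small sketch along these lines may help. The class name and the extra `witness` parameter are hypothetical additions for illustration only; `object.ReferenceEquals` reports whether two references point at the same object.

```csharp
using System;

public static class ReferenceCopyDemo
{
    // 'witness' is a hypothetical second parameter that carries another copy
    // of the caller's reference, so we can compare against it.
    public static (bool before, bool after) DoSomething(string strLocal, string witness)
    {
        bool before = object.ReferenceEquals(strLocal, witness); // same object as the caller's
        strLocal = "local";                                      // re-point the local copy only
        bool after = object.ReferenceEquals(strLocal, witness);  // now a different object
        return (before, after);
    }

    public static void Main()
    {
        string strMain = "main";
        var (before, after) = DoSomething(strMain, strMain);
        Console.WriteLine(before);  // True
        Console.WriteLine(after);   // False
        Console.WriteLine(strMain); // main
    }
}
```

The parameter starts out pointing at the caller's string and stops doing so after the assignment, while the caller's variable never moves.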

Immutability

Let's change the scenario for a second. Imagine we aren't working with strings, but some mutable reference type, like a class you've created.

class MutableThing
{
    public int ChangeMe { get; set; }
}

If you follow the reference objLocal to the object it points to, you can change its properties:

void DoSomething(MutableThing objLocal)
{
    objLocal.ChangeMe = 0;
}

There's still only one MutableThing in memory, and both the copied reference and the original reference still point to it. The properties of the MutableThing itself have changed:

void Main()
{
    var objMain = new MutableThing();
    objMain.ChangeMe = 5; 
    Console.WriteLine(objMain.ChangeMe); // it's 5 on objMain

    DoSomething(objMain);                // now it's 0 on objLocal
    Console.WriteLine(objMain.ChangeMe); // it's also 0 on objMain   
}
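
Mutating the object is different from reassigning the parameter, even for a mutable type. The sketch below contrasts the two; the method names `Mutate` and `Reassign` and the value 99 are made up for this illustration.

```csharp
using System;

public class MutableThing
{
    public int ChangeMe { get; set; }
}

public static class ReassignDemo
{
    // Follows the copied reference to the shared object and changes it:
    // the caller sees this.
    public static void Mutate(MutableThing objLocal)
    {
        objLocal.ChangeMe = 0;
    }

    // Points the copied reference at a new object:
    // the caller's reference is untouched.
    public static void Reassign(MutableThing objLocal)
    {
        objLocal = new MutableThing { ChangeMe = 99 };
    }

    public static void Main()
    {
        var objMain = new MutableThing { ChangeMe = 5 };

        Reassign(objMain);
        Console.WriteLine(objMain.ChangeMe); // 5

        Mutate(objMain);
        Console.WriteLine(objMain.ChangeMe); // 0
    }
}
```

Reassignment behaves exactly as it did for the string; only mutation through the reference reaches the caller's object.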

Ah, but strings are immutable! There's no ChangeMe property to set. You can't do strLocal[3] = 'H' in C# like you could with a C-style char array; you have to construct a whole new string instead. The only way to change strLocal is to point the reference at another string, and that means nothing you do to strLocal can affect strMain. The value is immutable, and the reference is a copy.
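
As a rough illustration of "construct a whole new string", one option is to copy the characters into an array, edit the array, and build a new string from it. `WithCharAt` is a hypothetical helper written for this example, not a framework method.

```csharp
using System;

public static class CharEditDemo
{
    // Hypothetical helper: returns a copy of 'source' with the character
    // at 'index' replaced. The original string is never modified.
    public static string WithCharAt(string source, int index, char replacement)
    {
        char[] chars = source.ToCharArray(); // mutable copy of the characters
        chars[index] = replacement;
        return new string(chars);            // a brand new string object
    }

    public static void Main()
    {
        string strLocal = "local";
        // strLocal[3] = 'H';  // does not compile: the string indexer is read-only

        string edited = WithCharAt(strLocal, 3, 'H');
        Console.WriteLine(strLocal); // local
        Console.WriteLine(edited);   // locHl
    }
}
```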

Passing a reference by reference

To prove there's a difference, here's what happens when you pass a reference by reference:

void DoSomethingByReference(ref string strLocal)
{
    strLocal = "local";
}
void Main()
{
    string strMain = "main";
    DoSomethingByReference(ref strMain);
    Console.WriteLine(strMain);          // Prints "local"
}

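This time, strMain in Main really does end up pointing at the new string, because you passed the variable itself by reference instead of passing a copy of the reference it holds. strLocal is now just another name for strMain.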

So even though strings are reference types, passing them by value means whatever goes on in the callee won't affect the string in the caller. But since they are reference types, you don't have to copy the entire string in memory when you want to pass it around.
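
A swap routine is a compact way to see both behaviors side by side. This is only a sketch; the names `SwapByValue` and `SwapByReference` are invented for the example.

```csharp
using System;

public static class SwapDemo
{
    // Swaps the two local copies of the references; the caller sees nothing.
    public static void SwapByValue(string a, string b)
    {
        string temp = a;
        a = b;
        b = temp;
    }

    // With 'ref', a and b are aliases for the caller's variables,
    // so the swap is visible to the caller.
    public static void SwapByReference(ref string a, ref string b)
    {
        string temp = a;
        a = b;
        b = temp;
    }

    public static void Main()
    {
        string first = "main";
        string second = "local";

        SwapByValue(first, second);
        Console.WriteLine(first + " " + second); // main local

        SwapByReference(ref first, ref second);
        Console.WriteLine(first + " " + second); // local main
    }
}
```

In neither case is any string data copied or altered; only the references move.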


Justin Morgan - On strike
  • 3
    @TheLight - Sorry, but you're incorrect here when you say: "A reference type is passed by reference by default." By default, all parameters are passed by value, but with reference types, this means that *the reference is passed by value.* You're conflating reference types with reference parameters, which is understandable because it's a very confusing distinction. See the [Passing Reference Types by Value section here.](http://msdn.microsoft.com/en-us/library/s6938f28(v=vs.110).aspx) Your linked article is quite correct, but it actually supports my point. – Justin Morgan - On strike Feb 15 '13 at 16:44
  • 1
    @JustinMorgan Not to bring up a dead comment thread, but I think TheLight's comment makes sense if you think in C. In C, data is just a block of memory. A reference is a pointer to that block of memory. If you pass the entire block of memory to a function, that's called "passing by value". If you pass the pointer it's called "passing by reference". In C#, there is no notion of passing in the entire block of memory, so they redefined "passing by value" to mean passing the pointer in. That seems wrong, but a pointer is just a block of memory too! To me, the terminology is pretty arbitrary – rliu Jul 01 '13 at 21:29
  • @roliu - The problem is that we're not working in C, and C# is extremely different despite its similar name and syntax. For one thing, [references are not the same as pointers](https://blogs.msdn.com/b/ericlippert/archive/2009/02/17/references-are-not-addresses.aspx), and thinking of them that way can lead to pitfalls. The biggest problem, though, is that "passing by reference" has a **very specific** meaning in C#, requiring the `ref` keyword. To prove that passing by reference makes a difference, see this demo: http://rextester.com/WKBG5978 – Justin Morgan - On strike Jul 02 '13 at 18:01
  • 2
    @JustinMorgan I agree that mixing C and C# terminology is bad, but, while I enjoyed lippert's post, I don't agree that thinking of references as pointers particularly fogs up anything here. The blog post describes how thinking of a reference as a pointer gives it too much power. I'm aware that the `ref` keyword has utility, I was just trying to explain why one _might_ think of passing a reference type by value in C# seems like the "traditional" (i.e. C) notion of passing by reference (and passing a reference type by reference in C# seems more like passing a reference to a reference by value). – rliu Jul 02 '13 at 18:21
  • @roliu - You're right, they have similarities, and it's hard to avoid the comparison when coming (as many of us did) from C/C++ to C#. I think we agree on most things except the importance of terminology. For less confusion, we could talk about [Call-by-Sharing](http://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_sharing) (C#) vs [Call-by-Reference](http://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_reference) (C). Actually, this question is a perfect example of why it's important: With call-by-reference semantics, the original string *would* be changed by operations in the callee. – Justin Morgan - On strike Jul 10 '13 at 21:12
  • @adamnationx, if you read this - I saw your suggested edit and fixed the typo you found. Great catch, thank you. – Justin Morgan - On strike Jun 11 '14 at 17:07
  • 3
    You are correct, but I think @roliu was referencing how a function such as `Foo(string bar)` could be thought of as `Foo(char* bar)` whereas `Foo(ref string bar)` would be `Foo(char** bar)` (or `Foo(char*& bar)` or `Foo(string& bar)` in C++). Sure, it's not how you should think of it everyday, but it actually helped me finally understand what is happening under the hood. – Cole Tobin Oct 12 '14 at 02:01
  • Actually, I see no difference between `string` and any other reference type when it is passed as a parameter. I can't find anything special about passing it in the specification or in Lippert's blog. As stated by [Lippert](https://ericlippert.com/2010/09/30/the-truth-about-value-types/), there is a third kind of value: references. "We see that references and instances of value types are essentially the same thing as far as their storage is concerned; they go on either the stack, in registers, or the heap depending on whether the storage of the value needs to be short-lived or long-lived." – Кое Кто Aug 25 '21 at 16:54
  • Actually, [learn.microsoft.com](https://learn.microsoft.com/en-us/dotnet/csharp/methods#passing-parameters) implies that "**passing by value**" is just the name for passing without `ref`, `out` etc. – Кое Кто Aug 25 '21 at 18:54
  • @КоеКто - You're correct, there's nothing special about strings when passed as a parameter. I didn't mean to imply otherwise. It's just that strings' immutability makes it easier to reason about them in this case. – Justin Morgan - On strike Oct 13 '21 at 21:34
28

Strings in C# are immutable reference objects. This means that references to them are passed around (by value), and once a string is created, you cannot modify it. Methods that produce modified versions of the string (substrings, trimmed versions, etc.) create modified copies of the original string.
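
A minimal sketch of "modified copies", using the real `Trim` and `Substring` methods. The helper `TrimAndTake` and the sample text are made up for illustration.

```csharp
using System;

public static class CopyOnModifyDemo
{
    // Hypothetical helper: trims whitespace, then keeps the first 'count' characters.
    // Each step returns a new string; 'text' itself is never altered.
    public static string TrimAndTake(string text, int count)
    {
        string trimmed = text.Trim();
        return trimmed.Substring(0, count);
    }

    public static void Main()
    {
        string original = "  hello world  ";
        string prefix = TrimAndTake(original, 5);

        Console.WriteLine("[" + original + "]"); // [  hello world  ]
        Console.WriteLine("[" + prefix + "]");   // [hello]
    }
}
```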

Sergey Kalinichenko
14

Strings are special cases. Each instance is immutable. When you change the value of a string you are allocating a new string in memory.

So only the reference is passed to your function, but when the string is edited it becomes a new instance and doesn't modify the old instance.
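
For contrast, here is a sketch comparing a `string` parameter with a mutable type such as `StringBuilder`. Both are passed the same way (a reference, by value); the difference is only that `StringBuilder` can be changed in place. The overloaded `AppendTo` methods are hypothetical names for this example.

```csharp
using System;
using System.Text;

public static class BuilderDemo
{
    // '+=' creates a new string and re-points the local copy of the reference.
    public static void AppendTo(string text)
    {
        text += "!";
    }

    // Append changes the one StringBuilder object that caller and callee share.
    public static void AppendTo(StringBuilder builder)
    {
        builder.Append("!");
    }

    public static void Main()
    {
        string text = "main";
        var builder = new StringBuilder("main");

        AppendTo(text);
        AppendTo(builder);

        Console.WriteLine(text);               // main
        Console.WriteLine(builder.ToString()); // main!
    }
}
```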

Enigmativity
  • 4
    Strings are *not* a special case in this aspect. It is very easy to create immutable objects which could have the same semantics. (That is, an instance of a type which does not expose a method to mutate it...) –  May 29 '12 at 03:08
  • Strings are special cases - they are effectively immutable reference types that appear to be mutable in that they behave like value types. – Enigmativity May 29 '12 at 03:27
  • `StringBuilder` is a mutable string class that allows fast modification of the string being built without allocating new strings in memory for each modification. – Enigmativity May 29 '12 at 03:28
  • 1
    @Enigmativity By that logic then `Uri` (class) and `Guid` (struct) are also special cases. I do not see how `System.String` acts like a "value type" any more than other immutable types... of either class or struct origins. –  May 29 '12 at 03:36
  • 4
    @pst - Strings have special creation semantics - unlike `Uri` & `Guid` - you can just assign a string-literal value to a string variable. The string appears to be mutable, like an `int` being reassigned, but it's creating an object implicitly - no `new` keyword. – Enigmativity May 29 '12 at 04:17
  • 3
    String is a special case, but that has no relevance to this question. Value type, reference type, whatever type will all act the same in this question. – Kirk Broadhurst May 29 '12 at 04:52
  • The only thing that makes strings a special case is that C# supports writing them as literals, and as @KirkBroadhurst points out, that's not relevant. Everything else, including their "value-type-like" behavior (by which I assume you mean things like `==` comparing them by value) can be easily replicated in user-defined types. I would not describe them as behaving like value types. – Justin Morgan - On strike Jul 02 '13 at 19:47