5

Possible Duplicate:
Does C# optimize the concatenation of string literals?

I just found out that when we write a line like this:

string s = "string";
s = s + s; // this translates to s = string.Concat(s, s);

However, when I opened the string class in Reflector, I could not see where this + operator is overloaded. I can see that == and != are overloaded:

[TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
public static bool operator ==(string a, string b)
{
    return string.Equals(a, b);
}

[TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
public static bool operator !=(string a, string b)
{
    return !string.Equals(a, b);
}

So why does Concat get called when we use + to combine strings?

Thanks.

Varun Sharma
  • 2
    @MichaelPetrotta it's not a duplicate of that question. The linked question actually concerns constant folding. – phoog Nov 29 '12 at 02:09
  • 2
    @phoog, well, I *kinda* agree with you in that the question doesn't cover that, but Jon's answer answers VVV's question precisely. – Michael Petrotta Nov 29 '12 at 02:11
  • 1
    @MichaelPetrotta, how does that answer by J. Skeet respond to this question? – horgh Nov 29 '12 at 02:56
  • 1
    If two strings were typed as `dynamic`, is the runtime hard-coded to treat the addition as `String.Concat`? – Chris Sinclair Nov 29 '12 at 03:12
  • @KonstantinVasilcov: *"why does concat gets called when we use + for combining strings?"* *"the expression ["test" + "test"] is evaluated at compile-time"*. Also, Michael Burr's answer: *"the C# compiler will also 'optimize' multiple concatenations involving non-literals using the '+' operator to a single call to a multi-parameter overload of the String.Concat() method"* – Michael Petrotta Nov 29 '12 at 03:19
  • 1
    @MichaelPetrotta `"test" + "test"` is evaluated at compile time, and the result is a constant string `"testtest"`. How the compiler does this is an implementation detail. On the other hand, assuming `string test = GetMeAString();`, the expression `test + test` is evaluated at run time. The C# spec calls for it to be compiled to `System.String.Concat(test, test)`. Very different. – phoog Nov 29 '12 at 03:33
  • 1
    @ChrisSinclair probably so. Dynamic calls are evaluated using the rules of the C# language, and interpreting the addition operator as a `Concat` call is required by the C# spec. – phoog Nov 29 '12 at 03:35
  • 1
    @phoog: I don't question anything you've written. I don't see how it's relevant, though. VVV has asked why System.String doesn't overload the `+` operator. Both answers I quoted answer that, for literal *and* non-literal strings - the compiler is doing the magic, whether by *literally* concatenating them, or by generating `String.Concat` calls. – Michael Petrotta Nov 29 '12 at 03:42
  • 1
    @MichaelPetrotta Jon's answer speaks about literals (and constants) only. As for Michael Burr's answer, VVV actually asks "why". – horgh Nov 29 '12 at 03:51
  • 1
    @KonstantinVasilcov: and Michael's answer answers that "why": *"the C# compiler will also 'optimize' multiple concatenations..."* At this point, we're going to have to agree to disagree, don't you think? – Michael Petrotta Nov 29 '12 at 03:54
  • 1
    @MichaelPetrotta I guess it would be nice to see some kind of C# compiler spec stating this, or something similar. – horgh Nov 29 '12 at 04:01
  • 1
    @KonstantinVasilcov I'm working on that. – phoog Nov 29 '12 at 04:10
  • 1
    @MichaelPetrotta Jon Skeet's answer isn't only relevant to strings. It also addresses the question "why does the IL load the constant 17 when I compile `int i = 10 + 7`?" I'm looking at the specs now; I'll have a more complete answer in a moment. – phoog Nov 29 '12 at 04:12
  • @MichaelPetrotta My point is this: Concatenation of string literals is defined, as Jon Skeet's answer notes, by §7.18 of the spec, which concerns "Constant Expressions". Concatenation of string variables is governed by §7.7.4 ("Addition operator"). In fact, my earlier comment is incorrect: the C# spec only requires the string + operator to concatenate strings; it does not specify how the compiler achieves that. The string type is defined in the CLI spec, and it does not include a + operator, so the compiler writers could either implement their own concatenation routine or call String.Concat. – phoog Nov 29 '12 at 04:23
  • 1
    @phoog: I don't disagree with that at all. It sounded like you were using that point to argue against something that I'd said, and I didn't understand what that was. Maybe you wanted to emphasize that how non-literal string concatenation is achieved is an implementation detail, rather than something present in a spec? – Michael Petrotta Nov 29 '12 at 04:26
  • 2
    @MichaelPetrotta Sorry I've been unclear. I am trying to emphasize the fact that the answer to the present question is "section 7.7.4" but Jon's answer to the supposed duplicate question is "section 7.18". Jon's answer, and the question, are about constant folding, not the definition of the + operator. This question is about the definition of the + operator. The common answer to both questions is, indeed, "the compiler does it", but if we seek the answers to our "why" questions in the specs, then the answers are different and the questions are not duplicates. – phoog Nov 29 '12 at 04:36
  • Thanks Michael Petrotta, phoog, Konstantin Vasilcov and Chris for your time. Now I know what's going on, and I will also read the C# spec (7.7.4 and 7.18). Thanks. – Varun Sharma Nov 29 '12 at 22:21

2 Answers

6

So why does Concat get called when we use + to combine strings?

Section 7.7.4 of the C# specification, "Addition operator", defines a binary addition operator for strings, where the operator returns the concatenation of the operands.

The definition of System.String in the CLI specification includes several Concat overloads, but no + operator. (I don't have a definitive answer explaining that omission, but I suppose it's because some languages define operators other than + for string concatenation.)

Given these two facts, the most logical solution for the C# compiler writer is to emit a call to String.Concat when compiling the +(string, string) operator.
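The second fact can be checked from code as well as from a decompiler. The following is a minimal sketch, assuming a reasonably recent C# compiler and the standard reflection API; the class name `StringOperatorProbe` and the helper `HasStaticMethod` are hypothetical names made up for this illustration. Operator overloads are compiled to static methods with special names (`op_Equality`, `op_Inequality`, `op_Addition`), so looking for those names on `System.String` shows which operators the type itself declares.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper for this illustration; not part of any library.
public static class StringOperatorProbe
{
    // Returns true if System.String declares a public static method with the given name.
    // Overloaded operators appear as static methods named op_Equality, op_Addition, etc.
    public static bool HasStaticMethod(string name)
    {
        return typeof(string)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Any(m => m.Name == name);
    }

    public static void Main()
    {
        Console.WriteLine("op_Equality:   " + HasStaticMethod("op_Equality"));
        Console.WriteLine("op_Inequality: " + HasStaticMethod("op_Inequality"));
        Console.WriteLine("op_Addition:   " + HasStaticMethod("op_Addition"));
        Console.WriteLine("Concat:        " + HasStaticMethod("Concat"));
    }
}
```

If the observation in the question holds on your runtime, the probe should report that `op_Equality`, `op_Inequality` and `Concat` exist while `op_Addition` does not, which is consistent with the compiler, rather than the type, supplying string addition.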

phoog
  • 2
    +1: for all the info...hope someone will give more facts about the issue – horgh Nov 29 '12 at 04:45
  • 1
    @KonstantinVasilcov I recall reading some Eric Lippert material on this question; I don't remember if it was here or on his blog, and I've spent too much time wallowing in the specs to start searching for that now. But I strongly recommend that you have a look yourself. Even if you don't find what you're looking for, you're sure to find something good. http://blogs.msdn.com/b/ericlippert and http://stackoverflow.com/users/88656/eric-lippert – phoog Nov 29 '12 at 04:53
  • 1
    Thanks! I do read Eric Lippert's blog)) Not quick enough, however) – horgh Nov 29 '12 at 04:57
5

The code

    public string Foo(string str1, string str2)
    {
        return str1 + str2;
    }

gives the following IL:

IL_0000:  nop
IL_0001:  ldarg.1
IL_0002:  ldarg.2
IL_0003:  call       string [mscorlib]System.String::Concat(string, string)
IL_0008:  stloc.0
IL_0009:  br.s       IL_000b
IL_000b:  ldloc.0
IL_000c:  ret

The compiler (at least the one in Visual Studio 2010) emits the call to String.Concat itself; System.String has no overloaded + operator.
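To see the equivalence from the C# side, here is a small sketch that compares the operator form with an explicit `String.Concat` call, including null operands (the C# spec says a null operand of string concatenation is treated as an empty string). The class and method names (`ConcatDemo`, `ViaOperator`, `ViaConcat`) are hypothetical and exist only for this example; it does not prove what IL a given compiler emits, only that both forms return the same results.

```csharp
using System;

// Hypothetical demo class; the names are made up for this illustration.
public static class ConcatDemo
{
    // The form used in the question: the + operator on two strings.
    public static string ViaOperator(string a, string b)
    {
        return a + b;
    }

    // The method that the IL listing above calls directly.
    public static string ViaConcat(string a, string b)
    {
        return string.Concat(a, b);
    }

    public static void Main()
    {
        string[][] cases =
        {
            new[] { "foo", "bar" },
            new[] { null, "bar" },
            new string[] { null, null }
        };

        foreach (string[] pair in cases)
        {
            string viaOperator = ViaOperator(pair[0], pair[1]);
            string viaConcat = ViaConcat(pair[0], pair[1]);

            // Both forms should agree for every pair, including the null cases.
            Console.WriteLine("'{0}' == '{1}': {2}", viaOperator, viaConcat, viaOperator == viaConcat);
        }
    }
}
```

Compiling both methods and inspecting them with a disassembler such as ildasm is a reasonable way to check whether your compiler produces the same `String.Concat` call for each.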

Guillaume
  • 1
    This answer merely confirms the behavior that the OP is asking about; it does not answer the question "why is it done that way?". – phoog Nov 29 '12 at 04:38
  • @phoog As you said yourself in one of your comments, the compiler has to implement it one way or another, as there is no overload for the `+` operator. The Visual Studio compiler uses `string.Concat`. Why `string.Concat` rather than another implementation? I think it's because it's the simple thing to do, but I don't know if the real answer is publicly available. So yes, my answer confirms the OP's expectation. In a way, `+` is just syntactic sugar, like `{ get; set; }`. – Guillaume Nov 29 '12 at 04:46
  • @phoog However, your answer is obviously better, as it refers to the specs and contains more "technical background". But it says neither why `string.Concat` is used rather than something else, nor why `+` is not simply overloaded. – Guillaume Nov 29 '12 at 04:52
  • 1
    Why `+` is not overloaded: it's hard to find a reason for not doing something. I never say to my boss "today, I refrained from writing a sort method because ...". Since I don't know the reason, I speculated. (I now have a vague recollection that I heard someone discuss that once in a Channel 9 video, but I am not sure about that.) Why `string.Concat`? I have come to the conclusion that it's technically an implementation detail, but it's hard to see how the author of a C# compiler could make any other choice and still be considered sane. – phoog Nov 29 '12 at 04:59