For instance, does the compiler know to translate
string s = "test " + "this " + "function";
to
string s = "test this function";
and thus avoid the performance hit with the string concatenation?
Yes. This is guaranteed by the C# specification. It's in section 7.18 (of the C# 3.0 spec):
Whenever an expression fulfills the requirements listed above, the expression is evaluated at compile-time. This is true even if the expression is a sub-expression of a larger expression that contains non-constant constructs.
(The "requirements listed above" include the + operator applied to two constant expressions.)
See also this question.
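As a rough illustration of the rule quoted above, here is a minimal sketch. The class and member names (ConstantFoldingDemo, Prefix, Middle, Folded, PartlyFolded) are made up for this example and do not come from the spec. It assumes that `const string` fields count as constant expressions, so they take part in folding just like literals, and that a constant sub-expression is still folded when the larger expression contains a non-constant operand.

```csharp
using System;

static class ConstantFoldingDemo
{
    // Hypothetical constants for illustration; each initializer is a constant expression.
    const string Prefix = "test ";
    const string Middle = Prefix + "this ";

    public static string Folded()
    {
        // Every operand is a constant, so the whole expression is a constant
        // expression and should be emitted as one string literal.
        return Middle + "function";
    }

    public static string PartlyFolded(int n)
    {
        // Parsed as ("test " + "this ") + n: the left sub-expression is constant
        // and can be folded even though n is only known at run time.
        return "test " + "this " + n;
    }

    public static void Main()
    {
        Console.WriteLine(Folded());         // test this function
        Console.WriteLine(PartlyFolded(42)); // test this 42
    }
}
```

Looking at the compiled output with ILDASM is the way to confirm what your particular compiler version actually emits.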
Just a side note on a related subject: the C# compiler will also 'optimize' multiple concatenations involving non-literals using the '+' operator into a single call to a multi-parameter overload of the String.Concat() method.
So
string result = x + y + z;
compiles to something equivalent to
string result = String.Concat( x, y, z);
rather than the more naive possibility:
string result = String.Concat( String.Concat( x, y), z);
Nothing earth-shattering, but just wanted to add this bit to the discussion about string literal concatenation optimization. I don't know whether this behavior is mandated by the language standard or not.
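To make the equivalence concrete, here is a small sketch with hypothetical names (ConcatDemo, WithOperator, WithConcat). It only shows that the two source forms produce the same result, including for null operands, which both treat as empty strings. It does not prove which IL a given compiler version emits; for that you would still need to inspect the assembly.

```csharp
using System;

static class ConcatDemo
{
    public static string WithOperator(string x, string y, string z)
    {
        // Non-constant operands: the compiler cannot fold this, so the
        // concatenation happens at run time.
        return x + y + z;
    }

    public static string WithConcat(string x, string y, string z)
    {
        // The explicit three-argument overload the operator form is said to compile to.
        return String.Concat(x, y, z);
    }

    public static void Main()
    {
        Console.WriteLine(WithOperator("a", "b", "c") == WithConcat("a", "b", "c"));
        Console.WriteLine(WithOperator("a", null, "c") == WithConcat("a", null, "c"));
    }
}
```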
Yes.
C# not only folds the concatenation of string literals at compile time, it also interns string literals: equivalent literals in the same assembly are collapsed into a single string instance, and every use of that literal refers to the same object.
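A quick way to observe this is a reference comparison. The sketch below uses hypothetical names (InterningDemo, LiteralsShareInstance, RuntimeResultSharesInstance) and assumes a mainstream .NET runtime with default settings. The folded literal and the plain literal should be the same object, while a string built at run time from separate parts should be a different object with equal contents.

```csharp
using System;

static class InterningDemo
{
    public static bool LiteralsShareInstance()
    {
        string a = "test this function";
        string b = "test " + "this " + "function"; // folded at compile time
        // Both variables are expected to point at the same interned instance.
        return Object.ReferenceEquals(a, b);
    }

    public static bool RuntimeResultSharesInstance(string[] parts)
    {
        // Built at run time from a parameter, so the compiler cannot fold it.
        string joined = String.Concat(parts);
        return Object.ReferenceEquals(joined, "test this function");
    }

    public static void Main()
    {
        string[] parts = { "test ", "this ", "function" };
        Console.WriteLine(LiteralsShareInstance());
        Console.WriteLine(RuntimeResultSharesInstance(parts));
    }
}
```

On a typical runtime the first call should report True and the second False, even though the two strings compare equal with ==.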
Yes - You can see this explicitly using ILDASM.
Example:
Here's a program that is similar to your example followed by the compiled CIL code:
Note: I am using the String.Concat() function just to see how the compiler treats the two different methods of concatenation.
Program
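using System;

class Program
{
    static void Main(string[] args)
    {
        // All operands are literals: the compiler folds them into one constant.
        string s = "test " + "this " + "function";
        // Explicit method call: the arguments stay separate and are joined at run time.
        string ss = String.Concat("test", "this", "function");
    }
}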
ILDASM
.method private hidebysig static void Main(string[] args) cil managed
{
  .entrypoint
  // Code size 29 (0x1d)
  .maxstack 3
  .locals init (string V_0,
                string V_1)
  IL_0000: nop
  IL_0001: ldstr "test this function"
  IL_0006: stloc.0
  IL_0007: ldstr "test"
  IL_000c: ldstr "this"
  IL_0011: ldstr "function"
  IL_0016: call string [mscorlib]System.String::Concat(string,
                                                        string,
                                                        string)
  IL_001b: stloc.1
  IL_001c: ret
} // end of method Program::Main
Notice how at IL_0001 the compiler emitted the single constant "test this function". Compare that with how it treats the String.Concat() call: it loads a separate constant for each argument (IL_0007 to IL_0011) and then calls String.Concat() at run time (IL_0016).
From the horse's mouth:
Concatenation is the process of appending one string to the end of another string. When you concatenate string literals or string constants by using the + operator, the compiler creates a single string. No run time concatenation occurs. However, string variables can be concatenated only at run time. In this case, you should understand the performance implications of the various approaches.
I had a similar question, but about VB.NET instead of C#. The simplest way of verifying this was to view the compiled assembly under Reflector.
The answer was that both the C# and VB.NET compilers optimise concatenation of string literals.
I believe the answer is yes, but you'd have to look at what the compiler emits: just compile, and use Reflector on it :-)