Does the C# compiler or .NET CLR do any clever memory optimisation of string literals/constants? I could swear I'd heard of the concept of "string internalisation" so that in any two bits of code in a program, the literal "this is a string" would actually refer to the same object (presumably safe, what with strings being immutable?). I can't find any useful reference to it on Google though...

Have I heard this wrong? Don't worry - I'm not doing anything horrible in my code with this information, just want to better my understanding of how it works under the covers.

Micha Wiedenmann
Neil Barnwell

3 Answers

EDIT: While I strongly suspect the statement below is true for all C# compiler implementations, I'm not sure it's actually guaranteed in the spec. Section 2.4.4.5 of the spec talks about literals referring to the same string instance, but it doesn't mention other constant string expressions. I suspect this is an oversight in the spec - I'll email Mads and Eric about it.


It's not just string literals. It's any string constant. So for example, consider:

public const string X = "X";
public const string Y = "Y";
public const string XY = "XY";

void Foo()
{
    string z = X + Y;
}

The compiler realises that the concatenation here (for z) is between two constant strings, and so the result is also a constant string. Therefore the initial value of z will be the same reference as the value of XY, because they're compile-time constants with the same value.
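
A minimal sketch of how one might check this, assuming the Microsoft C# compiler and a JIT-compiled runtime (other implementations may behave differently). The class `ConstantFoldingDemo` and its method names are made up for illustration, and `new string(char[])` is used only to force a separate run-time instance for contrast.

```csharp
using System;

public static class ConstantFoldingDemo
{
    public const string X = "X";
    public const string Y = "Y";
    public const string XY = "XY";

    // X + Y is a constant expression, so the compiler can fold it to "XY"
    // at compile time and emit it like any other literal.
    public static bool FoldedConstantSharesReference()
    {
        string z = X + Y;
        return object.ReferenceEquals(z, XY);
    }

    // A string built at run time is a separate object, even though its
    // contents are equal to XY.
    public static bool RuntimeStringSharesReference()
    {
        string built = new string(new[] { 'X', 'Y' });
        return object.ReferenceEquals(built, XY);
    }

    public static void Main()
    {
        Console.WriteLine(FoldedConstantSharesReference());
        Console.WriteLine(RuntimeStringSharesReference());
    }
}
```

With the Microsoft compiler the first check should report that both names refer to one instance, while the run-time string is a different object with equal contents.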

EDIT: The reply from Mads and Eric suggested that in the Microsoft C# compiler string constants and string literals are usually treated the same way - but that other implementations may differ.

Jon Skeet
  • Do two identical string constants in different assemblies point to the same object too? / Does the jitter intern string literals? – CodesInChaos Nov 26 '10 at 15:50
  • @CodeInChaos: I believe that depends on the `CompilationRelaxationsAttribute(CompilationRelaxations.NoStringInterning)` attribute. I wouldn't like to say for sure though. – Jon Skeet Nov 26 '10 at 15:53
  • Hi @JonSkeet, please advise: do interned strings with the same content always have the same reference? Does that mean that comparing the references of such strings will return true? – Johnny_D Apr 24 '13 at 15:43
  • @Johnny_D: Yes and yes - guaranteed within the same assembly, at least. Between assemblies it gets trickier, IIRC. – Jon Skeet Apr 24 '13 at 16:06

This article explains string interning pretty well. Quote:

.NET has the concept of an "intern pool". It's basically just a set of strings, but it makes sure that every time you reference the same string literal, you get a reference to the same string. This is probably language-dependent, but it's certainly true in C# and VB.NET, and I'd be very surprised to see a language it didn't hold for, as IL makes it very easy to do (probably easier than failing to intern literals). As well as literals being automatically interned, you can intern strings manually with the Intern method, and check whether or not there is already an interned string with the same character sequence in the pool using the IsInterned method. This somewhat unintuitively returns a string rather than a boolean - if an equal string is in the pool, a reference to that string is returned. Otherwise, null is returned. Likewise, the Intern method returns a reference to an interned string - either the string you passed in if it was already in the pool, or a newly created interned string, or an equal string which was already in the pool.
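
A short sketch of the two methods described in the quote, `string.Intern` and `string.IsInterned`. The class `InternPoolDemo` and its helper names are hypothetical, and the GUID-based string is only an assumption used to get a value that is very unlikely to be in the pool already.

```csharp
using System;

public static class InternPoolDemo
{
    // Builds a string at run time that is very unlikely to be pooled already.
    private static string FreshString() => "demo-" + Guid.NewGuid().ToString("N");

    // IsInterned returns null (not false) when no equal string is in the pool.
    public static bool IsInternedReturnsNullForFreshString()
    {
        return string.IsInterned(FreshString()) == null;
    }

    // Once a string is interned, equal strings resolve to the one pooled reference.
    public static bool EqualStringsInternToSameReference()
    {
        string original = FreshString();
        string copy = new string(original.ToCharArray()); // equal contents, distinct object

        string pooled = string.Intern(original);
        return !object.ReferenceEquals(original, copy)
            && object.ReferenceEquals(pooled, string.Intern(copy))
            && object.ReferenceEquals(pooled, string.IsInterned(copy));
    }

    public static void Main()
    {
        Console.WriteLine(IsInternedReturnsNullForFreshString());
        Console.WriteLine(EqualStringsInternToSameReference());
    }
}
```

Note that each call here adds a string to the pool for the rest of the process, which is harmless in a demo but is worth keeping in mind before interning strings manually in real code.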

Darin Dimitrov
  • Sidenote: Since interned strings aren't freed during the lifetime of the AppDomain, improper use of interning can cause a memory leak. – CodesInChaos Nov 26 '10 at 15:48

Yes, it does optimize string literals. One simple example where you can see that:

string s1 = "A";
string s2 = "A";
bool same = object.ReferenceEquals(s1, s2);  // true: both literals refer to the same instance
CodesInChaos