I'm trying to understand string interning. Not for any real purpose other than learning.

Here's where I'm at:

Strings are immutable and a reference type. It's this immutability that allows us to do string interning.

Without string interning, two equal strings would be two separate objects on the heap.

e.g.

private static void Main()
{
   var a = "foo";
   var b = "foo";
   ReferenceEquals(a, b); // would expect this to be false...
}

I would expect that ReferenceEquals to be false. It isn't, though; it's true. I thought that to make it true I would have to do:

private static void Main()
{
   var a = "foo";
   var b = "foo";
   ReferenceEquals(a, b); // false??

   string.Intern(a);
   string.Intern(b);
   ReferenceEquals(a, b); // true?
}

As I understand it, the interning process looks for the string in a hash table and adds it if it's not there. On further interning, it looks for the string and, if it finds it, changes the reference to point to the same place in the hash table.

This should speed up comparisons, since it doesn't need to check whether each char matches and can just check whether both strings point to the same location. (Let's ignore the overhead of actually interning for now, until I understand how this works.)

So what am I not getting? Why is the first code block returning true and not false?

Stuart
  • 4
    All string constants are interned by the compiler. Try `Object.ReferenceEquals("foo", "food".Substring(0, 3))` versus `Object.ReferenceEquals("foo", String.Intern("food".Substring(0, 3)))`. – Jeroen Mostert Aug 31 '17 at 12:02

1 Answer

This occurs because "foo" is interned.

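static void Main(string[] args)
{
    var a = "foo";
    var b = "foo";
    // IsInterned returns the pooled reference, or null if the string is not interned.
    Console.WriteLine(string.IsInterned(a)); // foo

    // Both literals refer to the same interned object.
    Console.WriteLine(ReferenceEquals(a, b)); // True
    Console.ReadLine();
}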

The compiler will intern all literals / constants by default.
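
To see a non-interned string, it has to be built at run time rather than written as a literal. The sketch below follows the suggestion in the comment on the question. The class name `InternDemo` and the helper `BuildFoo` are made up for illustration. It also shows that `string.Intern` does not change the variable you pass in; it returns the pooled reference, which you have to keep.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: builds "foo" at run time, so the result is
    // a new string object rather than the compiler-emitted literal.
    public static string BuildFoo() => "food".Substring(0, 3);

    public static void Main()
    {
        var literal = "foo";
        var built = BuildFoo();

        Console.WriteLine(built == literal);                 // True: same characters
        Console.WriteLine(ReferenceEquals(literal, built));  // False: different objects

        // Discarding the return value leaves 'built' pointing at the old object.
        string.Intern(built);
        Console.WriteLine(ReferenceEquals(literal, built));  // False

        // Keeping the return value gives the pooled reference.
        var pooled = string.Intern(built);
        Console.WriteLine(ReferenceEquals(literal, pooled)); // True
    }
}
```

This is also why the second code block in the question would not behave as expected even with run-time strings: the results of `string.Intern(a)` and `string.Intern(b)` are thrown away.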

mjwills
  • Thanks, I didn't know that literals / constants are interned by default. The book I'm working through gave the example using 2 constants and said the ReferenceEquals should be false! Which threw me! – Stuart Aug 31 '17 at 12:10
  • Yes, that is odd. You should never use `ReferenceEquals` with strings. – mjwills Aug 31 '17 at 12:12
  • @Stuart: feel free to edit your question to include the name and edition of the book, so future learners know what to avoid. :-P – Jeroen Mostert Aug 31 '17 at 12:12