1

Kindly look at the following program:

static void Main()
{
    string s1 = "Hello";
    string s2 = "Hello";
    Console.WriteLine ( ( object ) s1 == ( object ) s2 );
    Console.ReadLine();
}

The output of this snippet is "True". Now my question is:

  1. Does string s1 = "Hello"; create a new string object? If yes, how does it create a new object without calling a constructor and without using the new operator?

  2. If string s1 = "Hello" and string s2 = "Hello" create two objects, then how come the answer is True?

Jenish Rabadiya
  • 4
    `1)` http://stackoverflow.com/a/3328742/447156 `2)` http://stackoverflow.com/q/21278322/447156 – Soner Gönül Dec 07 '15 at 09:00
  • 1
    I'm very surprised how a very simple question turns into a fight about who explains the issue in the lowest-level possible. String interning, intermediate language. I expect an answer explaining the equality in assembly language....... – Matías Fidemraizer Dec 07 '15 at 09:48
  • While OP will find the most useful answer to himself, I believe that making the assumption that an issue can't be explained using high-level semantics is wrong. – Matías Fidemraizer Dec 07 '15 at 09:50
  • When you use a high-level language or framework like C# and .NET, you don't care about how memory is used or how the C# compiler translates C# code into IL, unless the question itself asks for these details. Isn't it enough to say that any object is checked for equality based on Object.ReferenceEquals unless Object.Equals is overridden? :D – Matías Fidemraizer Dec 07 '15 at 09:51
  • And I don't shout this here because I've already answered this Q&A the high-level way, but because if OP asks for this, it's because he's still a newbie in the C# arena, and too many details will confuse him and other future readers looking for a simple answer........... – Matías Fidemraizer Dec 07 '15 at 09:53

2 Answers

4

If you intend to compare object references, it's clearer to do it like so:

Console.WriteLine ( object.ReferenceEquals(s1, s2 ));

rather than like this:

Console.WriteLine ( ( object ) s1 == ( object ) s2 );

That said, let's rewrite your code a little:

using System;

public class Program
{
    public static void Main()
    {
        string s1 = "Hello";
        string s2 = string2();
        Console.WriteLine ( object.ReferenceEquals(s1, s2 )); // true

        string s3 = "Hel";
        s3 = s3 + "lo";

        Console.WriteLine ( object.ReferenceEquals(s1, s3 )); // false

        // This is the equivalent of the line above:
        Console.WriteLine ( ( object ) s1 == ( object ) s3 ); // also false

        Console.WriteLine (s1 == s3); // true (comparing string contents)

        s3 = string.Intern(s3);
        Console.WriteLine ( object.ReferenceEquals(s1, s3 )); // now true

        Console.ReadLine();
    }

    private static string string2()
    {
        return "Hello";
    }
}

Ok, so the question is, "Why do the first two strings have the same reference"?

The answer is that the compiler keeps a table of all the string literals it has stored so far. If a new literal it encounters is already in that table, it doesn't store a new one; instead, it makes the new literal reference the corresponding string that is already in the table. This is called string interning.
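A minimal sketch of that behaviour, assuming all the literals live in the same assembly (the `InternDemo` class and `GetGreeting` method are made-up names used only for illustration):

```csharp
using System;

public static class InternDemo
{
    // A second "Hello" literal, written in a different method.
    public static string GetGreeting()
    {
        return "Hello";
    }

    public static void Main()
    {
        string s1 = "Hello";

        // Both literals resolve to the same interned string object.
        Console.WriteLine(object.ReferenceEquals(s1, GetGreeting())); // True

        // "Hel" + "lo" is a compile-time constant, so the compiler folds it
        // into the literal "Hello" and it shares the same reference too.
        const string folded = "Hel" + "lo";
        Console.WriteLine(object.ReferenceEquals(s1, folded)); // True
    }
}
```

Note that the folding only applies to constant expressions; concatenating variables, as the `s3` lines in the code above do, happens at run time.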

The next thing to note is that if you create a new string by concatenating two strings at runtime, then that new string does NOT have the same reference as an existing string. A brand new string is created.

However if you use == to compare that string with another string that has a different reference but the same contents, true will be returned. That's because string == compares the contents of the string.

The following line in the above code demonstrates this:

Console.WriteLine (s1 == s3); // true (comparing string contents)
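The difference comes from how C# picks the `==` operator: the overload is chosen at compile time from the static types of the operands. Once both sides are typed as `object`, the string overload is no longer considered and a plain reference comparison is used. A small sketch of this (the class name `OperatorDemo` is a placeholder, and `new string(char[])` is used here only to guarantee a distinct object at run time):

```csharp
using System;

public static class OperatorDemo
{
    public static void Main()
    {
        string literal = "Hello";

        // Built at run time from a char array, so it is a separate object
        // with the same contents as the literal.
        string built = new string(new[] { 'H', 'e', 'l', 'l', 'o' });

        object o1 = literal;
        object o2 = built;

        Console.WriteLine(literal == built); // True: string's == compares contents
        Console.WriteLine(o1 == o2);         // False: object's == compares references
        Console.WriteLine(o1.Equals(o2));    // True: Equals is virtual, so string.Equals runs
    }
}
```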

Finally, note that the runtime can also "intern" strings that are created at run time, that is, use a reference to an existing string with the same contents rather than a new string. However, it does not do this automatically.

You can call string.Intern() to explicitly intern a string, as the code above shows.
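As a rough sketch of that last point, `string.IsInterned` can be used to check whether a string is already in the pool. The `Guid`-based value below is only a convenient way to get contents that will not match any literal in the program, and `PoolDemo` is a placeholder name:

```csharp
using System;

public static class PoolDemo
{
    public static void Main()
    {
        // Built at run time; no literal in the program has these contents.
        string fresh = "id-" + Guid.NewGuid().ToString();

        // Not interned automatically: IsInterned returns null.
        Console.WriteLine(string.IsInterned(fresh) == null); // True

        // Explicitly add it to the intern pool.
        string pooled = string.Intern(fresh);

        // A different object with the same contents is still a different reference...
        string copy = new string(fresh.ToCharArray());
        Console.WriteLine(object.ReferenceEquals(copy, pooled)); // False

        // ...until it is interned, which maps it to the pooled instance.
        Console.WriteLine(object.ReferenceEquals(string.Intern(copy), pooled)); // True
    }
}
```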

Matthew Watson
  • I'm not sure if string interning is the answer to this question – Matías Fidemraizer Dec 07 '15 at 09:32
  • @MatthewWatson: You say that the expression `( object ) s1 == ( object ) s2` is using `string.Equals` and thus doing a content comparison rather than a reference comparison. I'm not sure that's correct - see the example in the **Remarks** section in the description of `string.Intern` at https://msdn.microsoft.com/en-us/library/system.string.intern(v=vs.110).aspx – Gary McGill Dec 07 '15 at 09:34
  • @downvoter: Care to explain what you think is wrong with this answer? It would be enlightening! – Matthew Watson Dec 07 '15 at 09:54
  • Me! Some minutes ago I've given you a link: https://dotnetfiddle.net/Qst0NX I don't see how upcasting a reference to `object` modifies the equality – Matías Fidemraizer Dec 07 '15 at 09:58
  • @MatíasFidemraizer Well you are wrong, as this link shows: https://dotnetfiddle.net/8F3wDP – Matthew Watson Dec 07 '15 at 09:59
  • See what happens if you call `Equals` instead of `==` even with your latest sample – Matías Fidemraizer Dec 07 '15 at 10:03
  • @MatíasFidemraizer If you use `.Equals()` it will be calling the virtual `Equals()` method, so of course it will return true. But this is beside the point - the OPs code is using `==` and therefore it is doing a reference comparison. – Matthew Watson Dec 07 '15 at 10:07
  • BTW, since I'm a humble person: today I've learnt something new. I didn't know about this behavior with strings (when upcasting to object) because I had never come across this scenario. – Matías Fidemraizer Dec 07 '15 at 10:17
  • 2
    Perhaps it's worth pointing out that while `string` overloads the `==` operator, operator resolution happens at compile-time, so due to the casting to `object` the overloaded `string.==` operator won't be used, and it's a standard reference comparison instead. – Pieter Witvoet Dec 07 '15 at 10:19
  • @PieterWitvoet This is the main point that I wasn't describing in my own answer (now dropped...). And, don't you find this a language design issue? – Matías Fidemraizer Dec 07 '15 at 11:29
2

Does string s1 = "Hello"; create a new string object? If yes, how does it create a new object without calling a constructor and without using the new operator?

Yes. Not only does it create a new string, it also bakes it into the assembly's metadata under the "User Strings" section (this is otherwise called "string interning"), so the runtime can pull it directly from there and save the allocation time. You can view it using ILDASM:

User Strings
-------------------------------------------------------
70000001 : ( 5) L"Hello"

You can also see the compiler recognize it as a StringLiteralToken when it parses the syntax tree:

LINQPad Roslyn Visualizer

The compiler is aware of the special syntax for string literals and gives you this syntactic sugar.

If string s1 = "Hello" and string s2 = "Hello" create two objects, then how come the answer is True?

As I said in the first part, the string literal is only loaded at run time. This means that the string will be loaded once, cached, and compared against itself, so the reference equality check yields true.

You can see this in the emitted IL (Compiled in Release mode):

IL_0000:  ldstr       "Hello"
IL_0005:  ldstr       "Hello"
IL_000A:  stloc.0     // s2
IL_000B:  ldloc.0     // s2
IL_000C:  ceq    
Yuval Itzchakov