5

How is the code below printing True?

string x = new string(new char[0]);
string y = new string(new char[0]);
Console.WriteLine(object.ReferenceEquals(x,y));

I expected this to print False, because I expected two separate objects to be constructed and then their references compared.

Jon Skeet
Kylo Ren

3 Answers

5

This is an undocumented (as far as I'm aware) optimization in the CLR. It's very odd, but yes: the new operator is returning the same reference from two calls.

It appears to be implemented in CoreCLR as well on Linux (and even on Mono).

The string constructor is the only example of this that I've seen, although as noted in comments you can provoke it with other constructor overloads.
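To make the point about other overloads concrete, here is a minimal sketch that requests a zero-length string through three different constructor overloads and compares each result with string.Empty. The class and variable names are purely illustrative. Since the behaviour is an implementation detail rather than a documented guarantee, treat the result as something to observe on your own runtime; on the runtimes discussed here I would expect each line to print True.

```csharp
using System;

// Hypothetical demo class; the name is illustrative only.
public static class EmptyStringOverloads
{
    public static void Main()
    {
        // Each call asks for a zero-length string through a different overload.
        string fromArray = new string(new char[0]);                    // string(char[])
        string fromRepeat = new string('a', 0);                        // string(char, int)
        string fromSlice = new string(new char[] { 'a', 'b' }, 1, 0);  // string(char[], int, int)

        // Compare references, not values: is each result the shared empty string?
        Console.WriteLine(object.ReferenceEquals(fromArray, string.Empty));
        Console.WriteLine(object.ReferenceEquals(fromRepeat, string.Empty));
        Console.WriteLine(object.ReferenceEquals(fromSlice, string.Empty));
    }
}
```

If any line prints False on a given runtime, that overload simply does not take the shortcut there; code should not depend on either outcome.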

I'm convinced it's an optimization in the CLR, as the IL is as you'd expect it - and moving the constructor call into a different method doesn't change things either:

using System;

public class Test
{
    static void Main()
    {
        // Declaring as object to avoid using the == overload in string
        object x = CreateString(new char[0]);
        object y = CreateString(new char[0]);
        object z = CreateString(new char[1]);
        Console.WriteLine(x == y); // True
        Console.WriteLine(x == z); // False
    }

    static string CreateString(char[] c)
    {
        return new string(c);
    }
}

Now that the CLR is open source, we can find out where this is performed. It appears to be in object.cpp - if you search for occurrences of GetEmptyString you'll see it used in various cases when a string of length 0 is being constructed.

Jon Skeet
5

This happens because a special case is made for constructing empty strings from empty char arrays. The string constructor returns string.Empty for empty strings constructed in this way:

string x = new string(new char[0]);
string y = new string(new char[0]);
Console.WriteLine(object.ReferenceEquals(x, y)); // true
Console.WriteLine(object.ReferenceEquals(x, string.Empty)); // true

From the reference source for string (this is the constructor for a char* parameter):

[System.Security.SecurityCritical]  // auto-generated
private unsafe String CtorCharPtr(char *ptr)
{
    if (ptr == null)
        return String.Empty;

#if !FEATURE_PAL
    if (ptr < (char*)64000)
        throw new ArgumentException(Environment.GetResourceString("Arg_MustBeStringPtrNotAtom"));
#endif // FEATURE_PAL

    Contract.Assert(this == null, "this == null");        // this is the string constructor, we allocate it

    try {
        int count = wcslen(ptr);
        if (count == 0)
            return String.Empty;

        String result = FastAllocateString(count);
        fixed (char *dest = result)
            wstrcpy(dest, ptr, count);
        return result;
    }
    catch (NullReferenceException) {
        throw new ArgumentOutOfRangeException("ptr", Environment.GetResourceString("ArgumentOutOfRange_PartialWCHAR"));
    }
}

And also (this is the constructor for a char[] parameter):

    [System.Security.SecuritySafeCritical]  // auto-generated
    private String CtorCharArray(char [] value)
    {
        if (value != null && value.Length != 0) {
            String result = FastAllocateString(value.Length);

            unsafe {
                fixed (char * dest = result, source = value) {
                    wstrcpy(dest, source, value.Length);
                }
            }
            return result;
        }
        else
            return String.Empty;
    }

Note the lines:

        if (count == 0)
            return String.Empty;

and

        else
            return String.Empty;
Matthew Watson
1

That is because object.Equals first checks for reference equality, and then calls Equals on the first argument (x).

string.Equals compares the actual value of the strings (an ordinal, character-by-character comparison that does not depend on culture settings), not only the reference, so it returns true since both objects have the same value.
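The difference between value equality and reference equality is easier to see with non-empty strings, where the constructor does allocate a new object each time. A minimal sketch, with an illustrative class name:

```csharp
using System;

// Hypothetical demo class; the name is illustrative only.
public static class ValueVersusReference
{
    public static void Main()
    {
        // Two separately constructed, non-empty strings with the same contents.
        string x = new string(new char[] { 'a', 'b' });
        string y = new string(new char[] { 'a', 'b' });

        Console.WriteLine(object.Equals(x, y));          // True: falls through to string.Equals
        Console.WriteLine(x == y);                       // True: string's == overload compares values
        Console.WriteLine(object.ReferenceEquals(x, y)); // False: two distinct objects
        Console.WriteLine((object)x == (object)y);       // False: == on object compares references
    }
}
```

This is why the question uses object.ReferenceEquals: it is the only one of these checks that reveals whether the two variables point at the same object.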


For your edit: it seems that the CLR does some magic and evaluates your new string(new char[0]) so that it can reuse a single shared instance, much like an interned string. You can see the same behavior if you set x to "".

Patrick Hofman