I have some code that compares the references of two variables, and I came across a case that is a bit confusing:

string first = "10";
object second = 10.ToString();
dynamic third = second;

Console.WriteLine($"{first == second}   {first == third}");

The result is: False True

My first question is: why does first == third evaluate to True? Since third refers to the same object as second, I expected False, because the object references aren't equal.

I got even more confused when I changed the values to "1", as below:

string first = "1";
object second = 1.ToString();
dynamic third = second;

Console.WriteLine($"{first == second}   {first == third}");

Then the result becomes: True True

Why does this happen?

Enigmativity
Reza Ariyan

2 Answers

I am not sure why it changes when you change it from 10 to 1

I believe this is an implementation detail that you should not rely on (I will try to find something in the specs), but in .NET Core the int.ToString implementation caches the strings for single-digit non-negative numbers. Here is an excerpt from UInt32ToDecStr, which int.ToString calls internally:

// For single-digit values that are very common, especially 0 and 1, just return cached strings.
if (bufferLength == 1)
{
    return s_singleDigitStringCache[value];
}
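
To make the effect of such a cache concrete, here is a minimal sketch of the idea. It is not the runtime's actual code: singleDigitCache and FormatCached are hypothetical names invented for illustration, and the real implementation may differ between .NET versions.

```csharp
using System;

// Hypothetical cache: one shared string instance per digit 0-9.
string[] singleDigitCache = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" };

// Hypothetical formatter: single digits reuse the cached instance,
// anything larger falls through to the normal formatting path.
string FormatCached(uint value) =>
    value < 10 ? singleDigitCache[value] : value.ToString();

string a = FormatCached(4);
string b = FormatCached(4);

// Both calls return the very same object, so a reference comparison succeeds.
Console.WriteLine(object.ReferenceEquals(a, b)); // True

// The formatted value is unaffected by the caching.
Console.WriteLine(FormatCached(42) == "42");     // True
```

Whether two calls for a larger value such as 42 return the same instance is left open here on purpose, since that depends on the runtime.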

As for equality - please check:

  1. C# difference between == and Equals().
  2. String interning in .NET Framework. (The compiler interns string literals, so identical literals all point to the same address in memory.)
  3. Using type dynamic
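
The difference between reference and value comparison, and the effect of literal interning, can be illustrated with a small sketch. The variable names are made up for this example, and new string(char[]) is used only to force a separate object with the same characters.

```csharp
using System;

string literalA = "10";
string literalB = "10";                          // identical literal, so the same interned instance
string built = new string(new[] { '1', '0' });   // same characters, but a separately allocated object

bool sameLiteral = object.ReferenceEquals(literalA, literalB);
bool sameBuilt = object.ReferenceEquals(literalA, built);
bool valueEqual = literalA == built;             // string's operator == compares the characters
bool internedBack = object.ReferenceEquals(literalA, string.Intern(built));

Console.WriteLine(sameLiteral);  // True
Console.WriteLine(sameBuilt);    // False
Console.WriteLine(valueEqual);   // True
Console.WriteLine(internedBack); // True: Intern returns the pooled "10"
```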

UPD:

I was not able to find anything in the specs, but the following code behaves differently in .NET Framework and .NET 6 (the former prints False 11 times, while the latter prints True 10 times and False once):

var dict = new Dictionary<int, string>()
{
    {0, "0"},
    {1, "1"},
    {2, "2"},
    {3, "3"},
    {4, "4"},
    {5, "5"},
    {6, "6"},
    {7, "7"},
    {8, "8"},
    {9, "9"},
    {10, "10"},
};

foreach(var kvp in dict)
{
    Console.WriteLine(object.ReferenceEquals(kvp.Key.ToString(), kvp.Value));
}

UPD2:

The caching was introduced for performance reasons by this PR and is mentioned in the Performance Improvements in .NET Core 3.0 blog post:

In some sizeable web applications, we found that a large number of strings on the managed heap were simple integral values like “0” and “1”. And since the fastest code is code you don’t need to execute at all, why bother allocating and formatting these small numbers over and over when we can instead just cache and reuse the results (effectively our own string interning pool)? That’s what PR dotnet/coreclr#18383 does, creating a small, specialized cache of the strings for “0” through “9”, and any time we now find ourselves formatting a single-digit integer primitive, we instead just grab the relevant string from this cache.

private int _digit = 4;

[Benchmark]
public string SingleDigitToString() => _digit.ToString();
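
| Method | Toolchain | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| SingleDigitToString | netcoreapp2.1 | 17.72 ns | 0.3273 ns | 0.3061 ns | 1.00 | 0.0152 | | | 32 B |
| SingleDigitToString | netcoreapp3.0 | 11.57 ns | 0.1750 ns | 0.1551 ns | 0.65 | | | | |
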
Guru Stron
  • Wow, it really is a caching issue! I never thought they would cache them for some reason. – eocron Apr 23 '22 at 15:24
  • @eocron I believe it is done for performance reasons. As far as I know, something similar is done in Java as well. – Guru Stron Apr 23 '22 at 15:31
  • @eocron Java does something similar (since Java 5), but for all values from -128 to 127. So the concept is not unique to C#. – Polygnome Apr 24 '22 at 09:42
  • @Polygnome Python does that for numbers. What is surprising is doing this for strings. – Noone AtAll Apr 24 '22 at 16:36
  • @NooneAtAll It would make absolutely no sense to cache numbers in C#, because they are not references to a number on the heap; they are the numbers themselves, which get copied around. Caching strings, on the other hand, makes sense, as they are references. – Petrusion Apr 24 '22 at 18:45

The answer to the first question is that string equality isn't based on object references, which is the default for reference types.

first and third are both of type string, even though for third this is only known at runtime, so System.String's operator == overload is called, which:

...in turn, calls the static Equals(String, String) method, which performs an ordinal (case-sensitive and culture-insensitive) comparison.

(source)
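
The contrast between compile-time and runtime operator binding can be sketched as follows. This is only an illustration: new string(char[]) stands in for 10.ToString() so that second is guaranteed to be a separate object, and the pragma merely silences the expected warning.

```csharp
using System;

string first = "10";
object second = new string(new[] { '1', '0' });  // equal characters, distinct object
dynamic third = second;                          // same object, type resolved at runtime

#pragma warning disable CS0253
// Bound at compile time: the static type of second is object,
// so this is a reference comparison.
bool staticResult = first == second;
#pragma warning restore CS0253

// Bound at runtime: third holds a string, so string's operator == is chosen
// and the characters are compared.
bool dynamicResult = first == third;

Console.WriteLine($"{staticResult}   {dynamicResult}"); // False   True
```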

I'll also point out that Visual Studio provides a CS0253 compiler warning at first == second:

Possible unintended reference comparison; to get a value comparison, cast the right hand side to type 'string'
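
A short sketch of ways to get a value comparison instead, assuming second is known (or checked) to hold a string. As before, new string(char[]) is only there to guarantee a distinct object.

```csharp
using System;

string first = "10";
object second = new string(new[] { '1', '0' });  // equal characters, distinct object

// Cast the right-hand side, as the warning suggests.
bool viaCast = first == (string)second;

// string.Equals(object) compares values when the argument is a string.
bool viaEquals = first.Equals(second);

// Pattern matching avoids an InvalidCastException if second is not a string.
bool viaPattern = second is string s && first == s;

Console.WriteLine($"{viaCast} {viaEquals} {viaPattern}"); // True True True
```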

As for the second question... See @GuruStron's answer.

rfmodulator