1

I've been experiencing some peculiar behavior in my C# code that I'm struggling to understand. I'm hoping someone might be able to shed some light on it.

I've created a simple class and list of objects from that class where a string-object comparison is being performed. However, the output seems to suggest that a string-object comparison behaves differently depending on whether the object was directly assigned, cloned, or deserialized from a JSON string.

Here is a simplified example:

```
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

namespace TestJsonDeser
{
    internal class Program
    {
        public class Test
        {
            public object Name { get; set; }
            public object Value { get; set; }
        }

        static void Main(string[] args)
        {
            Test test0 = new Test()
            {
                Name = "test",
                Value = 1
            };

            var ts = JsonConvert.SerializeObject(test0);
            var test1 = JsonConvert.DeserializeObject<Test>(ts);

            var test2 = Clone(test0);
            var test3 = Clone(test1);
            var test4 = CreateTest();
            var test5 = Clone(test4);

            List<Test> list = new List<Test>() { test0, test1, test2, test3, test4, test5 };
            var r3 = list.Where(a => a.Name == "test").ToList();
            Console.WriteLine($"Count:{r3.Count}");
            foreach (var r in r3) Console.WriteLine($"Idx:{list.IndexOf(r)}");
            Console.ReadKey();
        }

        public static T Clone<T>(T source)
            where T : class, new()
        {
            if (source == null)
                return null;
            var tp = typeof(T);
            var ret = Activator.CreateInstance(typeof(T));
            foreach (var prop in tp.GetProperties(BindingFlags.Public | BindingFlags.Instance).Where(a => a.CanWrite && a.CanRead))
            {
                var value = prop.GetValue(source, null);
                if (value is string str)
                    prop.GetSetMethod().Invoke(ret, new object[] { str.Clone() });
                if (value is long ln)
                    prop.GetSetMethod().Invoke(ret, new object[] { ln });
            }
            return (T)ret;
        }

        static Test CreateTest()
        {
            var test = Activator.CreateInstance(typeof(Test));
            typeof(Test).GetProperty("Name").SetValue(test, "test", null);
            typeof(Test).GetProperty("Value").SetValue(test, 1, null);
            return (Test)test;
        }
    }
}
```

The output is:

```
Count:4
Idx:0
Idx:2
Idx:4
Idx:5
```

I would expect that all the Test objects would have the Name property equal to "test", but it seems that only the ones that were directly assigned, cloned from the directly assigned, or created with Activator.CreateInstance are considered as having Name equal to "test".

The Test objects that were deserialized from a JSON string or cloned from the deserialized do not match, even though printing the Name property clearly shows the value is "test".

Could anyone help explain why this is happening? Is this a bug in LINQ or Newtonsoft's JSON deserialization, or is it some language feature that I'm not aware of?

Any help would be appreciated.

Cirrosi
  • What happens if you check using `string.Equals()`? I think your issue may be caused by the difference in behavior between `==` and `string.Equals()` as described here: https://stackoverflow.com/questions/1659097/why-would-you-use-string-equals-over – akseli May 23 '23 at 20:46
  • Hi @akseli, Thanks for your suggestion. However, the goal of my question isn't to find a workaround, but to understand why there's a discrepancy in behavior between a Test object created directly in code and a Test object deserialized from a JSON string. Both objects seem identical (i.e., the Name property of both objects is "test") but they are treated differently by LINQ's comparison operator. I'm curious as to why this is happening, whether it's an issue with LINQ, Newtonsoft's JSON deserialization, or a C# language feature I'm unaware of. I appreciate any insights. Thanks! – Cirrosi May 23 '23 at 21:05
  • One thing to note is that `String.Clone()` does not clone anything, it just returns the same reference, so you are not creating a new `string` instance in your `Clone` method. You could use the obsolete method `String.Copy(string)` if you wanted to create a new instance. – NetMage May 23 '23 at 22:02
  • Hi @NetMage, Thank you for your insight! It's indeed surprising that string.Clone() just returns a reference to the same string, especially given its name. I appreciate your contribution to this discussion. Thanks again! – Cirrosi May 24 '23 at 06:29

2 Answers

1

The compiler automatically creates one string for all duplicate constant strings, so every reference to "test" is to the same object: in the construction of test0, in the CreateTest method, and in the reference-based comparison inside the Where clause. That explains why test0 and test4 match.
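A minimal sketch to check the literal sharing yourself, assuming a C# 9+ project with top-level statements (the variable names are only illustrative):

```csharp
using System;

// Two separate occurrences of the same literal in one assembly.
string a = "test";
string b = "test";

// Value equality holds, as expected.
Console.WriteLine(a == b);                       // True
// Reference equality holds too: both variables point at the same string object.
Console.WriteLine(object.ReferenceEquals(a, b)); // True
```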

When you call your Clone method to clone the objects, you special-case strings by calling String.Clone(), which just returns the same string reference. That explains why the clones of test0 and test4 (test2 and test5) also match.
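The behavior of String.Clone() is easy to verify. A small sketch, again assuming top-level statements and illustrative names:

```csharp
using System;

string original = "test";

// String.Clone() is declared to return object; it hands back the very same instance.
object cloned = original.Clone();

Console.WriteLine(object.ReferenceEquals(original, cloned)); // True
```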

The JSON deserializer, by contrast, copies the string from the JSON input character by character into a char[] and then converts that buffer to a string with new string() when the value is requested. It is roughly as if your Clone method handled strings with new string(str.ToArray()) instead of str.Clone().

Essentially that creates a new instance of string that happens to have the same value.

That explains why test1 doesn't match. test3 doesn't match either because, again, your Clone method doesn't create a new string instance; it just returns the same reference that test1 holds.
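The effect can be reproduced without any JSON library. The sketch below is not Newtonsoft's actual code path; it only assumes that building a string from a char[] yields a distinct instance, and it stores the result in an object variable the way the Name property does:

```csharp
using System;

// Build a new string instance from characters, similar in spirit to what a parser does.
object fresh = new string("test".ToCharArray());

// With the left side typed as object, == is a reference comparison.
// The compiler normally flags this with warning CS0252, suppressed here on purpose.
#pragma warning disable CS0252
Console.WriteLine(fresh == "test");         // False
#pragma warning restore CS0252

// The virtual Equals call dispatches to String.Equals and compares contents.
Console.WriteLine(fresh.Equals("test"));    // True
// Casting restores the string overload of ==, which also compares contents.
Console.WriteLine((string)fresh == "test"); // True
```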

It is very important to note that, conceptually, Strings in .NET are immutable (can't be changed) and unique. In the current implementation of .NET String, no deduplication is done to prevent creating multiple instances with the same value, but that may not always be true. You should never use reference equality for String. See this Microsoft article for more on the complications of String comparisons.

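I think it is an unfortunate wart in C#'s design that it works pretty hard to make String seem like a value type, and then falls down because == is not always the same as String.Equals. Operators are chosen at compile time from the static types of the operands, so with Name declared as object, a.Name == "test" binds to the reference comparison for object rather than to String's overloaded ==. The fact is, String is a class type, and in comparisons and equality it behaves just like any other user-defined class.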
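If the property has to stay typed as object, the filter can be written so that it never depends on string identity. A hedged sketch using a plain List<object> as a stand-in for the Name values (the list contents and variable names are illustrative, not taken from the question):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var names = new List<object>
{
    "test",                            // the literal
    new string("test".ToCharArray()),  // distinct instance, same characters
    1L,                                // not a string at all
};

// Pattern matching narrows to string first, so == is String's value comparison.
int byPattern = names.Count(n => n is string s && s == "test");

// The static object.Equals is null-safe and calls the virtual Equals override.
int byEquals = names.Count(n => object.Equals(n, "test"));

Console.WriteLine($"{byPattern} {byEquals}"); // 2 2
```

Unlike a direct (string) cast, neither form throws when the value is not a string.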

NetMage
  • Hello NetMage, Thank you for your insightful response, it was a real eye-opener. The crucial information that I was missing, and you pointed out, is how the C# compiler treats identical strings. It creates only one instance of such string literals and all subsequent uses of the same literal point to the same memory location. I appreciate your time and expertise in providing this clarification. This discussion has significantly deepened my understanding of string handling in .NET. – Cirrosi May 24 '23 at 06:24
0

It is likely due to JSON.NET deserializing a JSON string to a 'JObject' if it doesn't have a specific type to deserialize to. Using the '==' operator will compare the references of the two objects. A JObject that has a value of "test" is not the same reference as a string object with a value of "test".

You could alter the types used when deserializing with your "Test" class or use ToString() if you are confident in the value type.

Jordan Brobyn
  • Hi Jordan, Thanks for your input. However, when deserializing to a specific type (JsonConvert.DeserializeObject<Test>(ts)), Newtonsoft.JSON doesn't create a JObject, but instances of the specified type. So Name should be a string, not a JObject. The intriguing part is that even a cloned object from the deserialized one, with properties set as string and long, doesn't match the query. This mystery is what I'm seeking to solve. Thanks again for your insights! – Cirrosi May 23 '23 at 21:25
  • If it is creating a string instead of a JObject, then it is still becoming a new object reference. The '==' operator will compare these references instead of the string contents. The comparison is false for the JSON-deserialized 'Test', but likely 'true' for the directly assigned 'Test', because in the latter case both sides of the '==' are the same object. To get the behavior you want, you can cast to a string: `var r3 = list.Where(a => (string)a.Name == "test").ToList();` – Jordan Brobyn May 23 '23 at 21:59