18

I just came across something pretty weird to me : when you use the Equals() method on a value type (and if this method has not been overriden, of course) you get something very very slow -- fields are compared one to one using reflection ! As in :

public struct MyStruct{
   int i;
}

   (...)

   MyStruct s, t;
   s.i = 0;
   t.i = 1;
   if ( s.Equals( t ))   /*  s.i will be compared to t.i via reflection here. */
      (...)

My question : why does the C# compiler do not generate a simple method to compare value types ? Something like (in MyStruct's definition) :

   public override bool Equals( Object o ){
      if ( this.i == o.i )
         return true;
      else
         return false;
   }

The compiler knows what are the fields of MyStruct at compile time, why does it wait until runtime to enumerate MyStruct fields ?

Very strange to me.

Thanks :)

ADDED : Sorry, I just realize that, of course, Equals is not a language keyword but a runtime method... The compiler is completely unaware of this method. So it make sens here to use reflection.

David Klempfner
  • 8,700
  • 20
  • 73
  • 153
Sylvain Rodrigue
  • 4,751
  • 5
  • 53
  • 67
  • 1
    "To use the standard implementation of Equals, your value type must be boxed and passed as an instance of the reference type System.ValueType. The Equals method then uses reflection to perform the comparison." - msdn.microsoft.com/en-us/library/ff647790.aspx – Ludington Feb 28 '14 at 02:09

3 Answers3

11

It doesn't use reflection when it doesn't need to. It just compare values bit by bit in case the struct if it can do so. However, there if any of the struct members (or members of members, any descendants) override object.Equals and provide its own implementation, obviously, it can't rely on bit-by-bit comparison to calculate the return value.

The reason it's slow is that the parameter to Equals is of type object and value types have to be boxed to be treated as an object. Boxing involves allocating memory on the heap and memory copying the value type to that location.

You could manually provide an overload for the Equals method that takes your own struct as parameter to prevent boxing:

public bool Equals(MyStruct obj) {
     return obj.i == i;
}
Mehrdad Afshari
  • 414,610
  • 91
  • 852
  • 789
  • 5
    It uses reflection in some cases. If it detects it can just blit the results, it does so - but if there are reference types (or types containing reference types) in the fields, it has to do a more painful process. – Jon Skeet Jun 17 '09 at 20:43
  • 1
    While I was writing this, I read around and I found that some .Net frameworks authors (Cwalina, Abrams) just confirm that Equals is using reflection on value types. But maybe just in Framework 2.0 ? – Sylvain Rodrigue Jun 17 '09 at 20:48
  • 2
    Sylvain: They are right. As Jon said, if the struct contains reference types as members, it has to call Equals on that fields. I updated the answer to reflect that. The point I was trying to make is that it doesn't use reflection when it doesn't need so (like your example). – Mehrdad Afshari Jun 17 '09 at 20:51
  • 2
    I have to say, I don't understand why a bit-wise comparison can't be made when there are references. If two references point to the same object, when would the pointers not be exactly equal? – snarf Jun 17 '09 at 21:04
  • 2
    @Snarfblam: From the code Mehrdad posted, it seems as if it calls Equals on each of the referenced objects contained within the struct - almost recursively. If two objects are equal but their references aren't, a bit-wise comparison will fail. – Samir Talwar Jun 17 '09 at 21:12
  • "To use the standard implementation of Equals, your value type must be boxed and passed as an instance of the reference type System.ValueType. The Equals method then uses reflection to perform the comparison." - http://msdn.microsoft.com/en-us/library/ff647790.aspx – Ludington Feb 28 '14 at 01:59
  • Regarding the assertion that the reason it's slow is because of boxing: my understanding is that allocating objects on the heap, even with .NET (or more so since .NET can defrag the heap), is normally a fast operation. Reflection is generally a slow operation where you navigate memory based on metadata in an incredibly unoptimized manner. I don't know the metrics on how long the different operations take, but my money is on reflection being the culprit. – snarf Jul 27 '18 at 00:00
11

The following is the decompiled ValueType.Equals method from mscorlib:

public override bool Equals(object obj)
{
    if (obj == null)
    {
        return false;
    }
    RuntimeType type = (RuntimeType) base.GetType();
    RuntimeType type2 = (RuntimeType) obj.GetType();
    if (type2 != type)
    {
        return false;
    }
    object a = this;
    if (CanCompareBits(this))
    {
        return FastEqualsCheck(a, obj);
    }
    FieldInfo[] fields = type.GetFields(BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance);
    for (int i = 0; i < fields.Length; i++)
    {
        object obj3 = ((RtFieldInfo) fields[i]).InternalGetValue(a, false);
        object obj4 = ((RtFieldInfo) fields[i]).InternalGetValue(obj, false);
        if (obj3 == null)
        {
            if (obj4 != null)
            {
                return false;
            }
        }
        else if (!obj3.Equals(obj4))
        {
            return false;
        }
    }
    return true;
}

When possible, a bit-wise comparison will be done (note the CanCompareBits and FastEqualsCheck, both of which are defined as InternalCall. The JIT would presumably inject the appropriate code here. As to why it is so slow, I couldn't tell you.

snarf
  • 2,684
  • 1
  • 23
  • 26
  • 3
    I wonder if there would be any compatibility problems if the runtime were to auto-generate an `Equals` override for any struct which didn't already define one: `bool Equals(object other) { return StructComparer.EqualsProc(ref this, other); }`, where `EqualsProc` was a static delegate field within static class `StructComparer`? Such an approach would avoid having to use Reflection every time an object is compared, and could also avoid a boxing step. – supercat Nov 13 '13 at 20:26
3

The idea of a compiler generated function is justified.

Thinking about the effects however I think the language design team did it right. Compilergenerated methods known from C++ are hard to understand for beginners. Lets see what would happen in C# with autogenerated struct.Equals:

As it is now, the concept of .Equals() is simple:

  • Every struct inherits Equals from ValueType.
  • If overriden, the custom Equals method applies.

If the compiler would always create the Equals method, we could have:

  • Every struct inherits Equals from Object. (ValueType would no longer implement its own version)
  • Object.Equals is now always(!) overriden, either by the compiler generated Equals method or by the users implementation

Now our struct has an autogenerated override method that the code reader does not see! So how do you know that the base method Object.Equals does not apply to your struct? By learning all the cases of automatically compiler generated methods. And this is exactly one of the burdens learning C++.

Would consider it good decision to leave efficient struct Equals to the user and keep the concepts simple, requiring a standard default Equals method.

That said, performance critical structs should override Equals. The code below shows

3606 vs 53 Milliseconds measured on .Net 4.5.1

This performance gain is certainly due to avoiding virtual Equals, but anyway, so if the virtual Object.Equals would be called the gain would be much lower. Performance critical cases will not call Object.Equals however, so the gain here would apply.

using System;
using System.Diagnostics;

struct A
{
    public int X;
    public int Y;
}

struct B : IEquatable<B>
{
    public bool Equals(B other)
    {
        return this.X == other.X && this.Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return obj is B && Equals((B)obj);
    }

    public int X;
    public int Y;
}


class Program
{
    static void Main(string[] args)
    {
        var N = 100000000;

        A a = new A();
        a.X = 73;
        a.Y = 42;
        A aa = new A();
        a.X = 173;
        a.Y = 142;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++)
        {
            if (a.Equals(aa))
            {
                Console.WriteLine("never ever");
            }
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);

        B b = new B();
        b.X = 73;
        b.Y = 42;
        B bb = new B();
        b.X = 173;
        b.Y = 142;

        sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++)
        {
            if (b.Equals(bb))
            {
                Console.WriteLine("never ever");
            }
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}

see also http://blog.martindoms.com/2011/01/03/c-tip-override-equals-on-value-types-for-better-performance/

citykid
  • 9,916
  • 10
  • 55
  • 91
  • It's worth noting that the compiler doesn't use Reflection; it simply uses virtual method dispatch to a method `ValueType.Equals`; because that method expects `this` to be a class type [`ValueType` is, despite its name, a class] the value needs to be boxed. Conceptually, it might have been nice if `ValueType` had defined a static method `ValueTypeEquals(ref T it, Object other) { ValueTypeComparer.Compare(ref it, other);` and recommended that when possible compilers call that in preference to the virtual `Equals` method. Such an approach could have... – supercat Nov 21 '13 at 03:41
  • ...enabled the runtime to generate a comparer for each type only on the first time it's used, and then access that comparer directly on subsequent invocations. – supercat Nov 21 '13 at 03:42
  • 1
    Nitpick: The `ReferenceEquals(null, obj)` call is technically redundant as [the `is` expression evaluates to false if the provided expression (`obj`) is null](https://msdn.microsoft.com/en-us/library/scekt9xw.aspx). I'm sure it doesn't affect the results of the benchmark in any useful way though. Still, I wouldn't rely on the compiler to optimize it out if it would ever matter. – tne Feb 02 '15 at 14:50