This is a followup question to this: List<T>.Contains and T[].Contains behaving differently

T[].Contains behaves differently depending on whether T is a class or a struct. Suppose I have this struct:

public struct Animal : IEquatable<Animal>
{
    public string Name { get; set; }

    public bool Equals(Animal other) //<- he is the man
    {
        return Name == other.Name;
    }
    public override bool Equals(object obj)
    {
        // Guard against null and foreign types instead of throwing on the cast.
        return obj is Animal && Equals((Animal)obj);
    }
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

var animals = new[] { new Animal { Name = "Fred" } };

animals.Contains(new Animal { Name = "Fred" }); // calls Equals(Animal)

Here, the generic Equals is called, as I expected.

But in case of a class:

public class Animal : IEquatable<Animal>
{
    public string Name { get; set; }

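    public bool Equals(Animal other)
    {
        // A null argument is never equal to an existing instance.
        return other != null && Name == other.Name;
    }
    public override bool Equals(object obj) //<- he is the man
    {
        // 'as' yields null for foreign types instead of throwing.
        return Equals(obj as Animal);
    }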
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

var animals = new[] { new Animal { Name = "Fred" } };

animals.Contains(new Animal { Name = "Fred" }); // calls Equals(object)

The non-generic Equals is called, taking away the benefit of implementing IEquatable<T>.

Why does the array call Equals differently for struct[] and class[], even though both collections appear to be generic?

The array weirdness is so frustrating that I'm thinking of avoiding it totally...

Note: The generic version of Equals is called only when the struct implements IEquatable<T>. If the type doesn't implement IEquatable<T>, the non-generic overload of Equals is called, irrespective of whether it is a class or a struct.

nawfal
  • Have you tried passing a comparer explicitly to Contains? That would circumvent the code that ends up calling IndexOf (which may be what's causing you trouble). – hatchet - done with SOverflow Nov 10 '13 at 10:06
  • @hatchet excellent point, that will work, the source code reflects it. I know I can change the whole `T[]` thing. I was just learning. – nawfal Nov 10 '13 at 10:10
  • I agree it seems odd. I would also expect the overridden Equals(object) to be called in both cases, even if IndexOf is what's doing the work of Contains. Have you put in breakpoints to verify that Animal's Equals is not getting called? – hatchet - done with SOverflow Nov 10 '13 at 10:20
  • Seems odd. But the main benefit of `IEquatable<T>` is that it avoids boxing, which only matters for structs. There is still a minor gain from the generic method for classes, but the benefit is far smaller compared with structs. – CodesInChaos Nov 10 '13 at 10:22
  • @CodesInChaos agreed, I'm aware. I noticed a bug in my code when using `Contains` on arrays after having forgotten to override the non-generic `Equals` (so `Contains` did a reference equality check). – nawfal Nov 11 '13 at 05:19
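
The suggestion in the comments above, passing a comparer explicitly to Contains, can be sketched as follows. This is only an illustration under stated assumptions: `TrackedAnimal`, its `LastCalled` field, and `ExplicitComparerDemo` are hypothetical names introduced here so that the overload that ran can be observed. They are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of the question's class Animal that records which overload ran.
public class TrackedAnimal : IEquatable<TrackedAnimal>
{
    public static string LastCalled = "none";

    public string Name { get; set; }

    public bool Equals(TrackedAnimal other)
    {
        LastCalled = "Equals(TrackedAnimal)";
        return other != null && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        LastCalled = "Equals(object)";
        var other = obj as TrackedAnimal;
        return other != null && Name == other.Name;
    }

    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class ExplicitComparerDemo
{
    // Runs the comparer-taking Contains overload and reports the result
    // together with the Equals overload that was invoked last.
    public static string Run()
    {
        var animals = new[] { new TrackedAnimal { Name = "Fred" } };
        TrackedAnimal.LastCalled = "none";

        // Enumerable.Contains(source, value, comparer) compares through the comparer.
        bool found = animals.Contains(
            new TrackedAnimal { Name = "Fred" },
            EqualityComparer<TrackedAnimal>.Default);

        return found + " via " + TrackedAnimal.LastCalled;
    }
}
```

Because the default comparer for a type that implements IEquatable<T> dispatches to the generic overload, `ExplicitComparerDemo.Run()` should report that `Equals(TrackedAnimal)` was used even though the element type is a class.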

3 Answers

It appears that it's not actually Array.IndexOf() that ends up getting called. Looking at the source for that, I would have expected Equals(object) to get called in both cases if that were the case. The stack trace at the point where Equals gets called makes it clearer why you're seeing this behavior (the value type gets Equals(Animal), but the reference type gets Equals(object)).

Here is the stack trace for the value type (struct Animal):

at Animal.Equals(Animal other)
at System.Collections.Generic.GenericEqualityComparer`1.IndexOf(T[] array, T value, Int32 startIndex, Int32 count)
at System.Array.IndexOf[T](T[] array, T value, Int32 startIndex, Int32 count)
at System.Array.IndexOf[T](T[] array, T value)
at System.SZArrayHelper.Contains[T](T value)
at System.Linq.Enumerable.Contains[TSource](IEnumerable`1 source, TSource value) 

Here is the stack trace for the reference type (class Animal):

at Animal.Equals(Object obj)
at System.Collections.Generic.ObjectEqualityComparer`1.IndexOf(T[] array, T value, Int32 startIndex, Int32 count)
at System.Array.IndexOf[T](T[] array, T value, Int32 startIndex, Int32 count)
at System.Array.IndexOf[T](T[] array, T value)
at System.SZArrayHelper.Contains[T](T value)
at System.Linq.Enumerable.Contains[TSource](IEnumerable`1 source, TSource value)

From this you can see that it's not Array.IndexOf that's getting called; it's Array.IndexOf[T]. That method does end up using equality comparers. In the case of the reference type, it uses ObjectEqualityComparer, which calls Equals(object). In the case of the value type, it uses GenericEqualityComparer, which calls Equals(Animal), presumably to avoid an expensive boxing operation.
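
As a rough illustration of how an equality comparer decides between the two overloads, the sketch below asks EqualityComparer<T>.Default to compare two kinds of objects. `WithEquatable`, `WithoutEquatable`, and `DefaultComparerDemo` are hypothetical names made up for this example. It only shows what the default comparer does when asked directly; which comparer the array's Contains ends up with is the separate matter shown in the stack traces above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type that implements IEquatable<T> and records the overload used.
public class WithEquatable : IEquatable<WithEquatable>
{
    public static string LastCalled = "none";

    public bool Equals(WithEquatable other)
    {
        LastCalled = "generic";
        return other != null;
    }

    public override bool Equals(object obj)
    {
        LastCalled = "object";
        return obj is WithEquatable;
    }

    public override int GetHashCode() { return 0; }
}

// Hypothetical type that only overrides Equals(object).
public class WithoutEquatable
{
    public static string LastCalled = "none";

    public override bool Equals(object obj)
    {
        LastCalled = "object";
        return obj is WithoutEquatable;
    }

    public override int GetHashCode() { return 0; }
}

public static class DefaultComparerDemo
{
    // Compares one pair of each type through the default comparer and
    // reports which overload each comparison reached.
    public static string Run()
    {
        EqualityComparer<WithEquatable>.Default.Equals(
            new WithEquatable(), new WithEquatable());
        EqualityComparer<WithoutEquatable>.Default.Equals(
            new WithoutEquatable(), new WithoutEquatable());

        return WithEquatable.LastCalled + "/" + WithoutEquatable.LastCalled;
    }
}
```

The default comparer reaches the generic overload only for the type that implements IEquatable<T>; for the other type it has nothing to call except Equals(object).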

If you look at the source code for IEnumerable<T> at http://www.dotnetframework.org, it has this interesting bit at the top:

// Note that T[] : IList<T>, and we want to ensure that if you use
// IList<YourValueType>, we ensure a YourValueType[] can be used
// without jitting.  Hence the TypeDependencyAttribute on SZArrayHelper.
// This is a special hack internally though - see VM\compile.cpp.
// The same attribute is on IList<T> and ICollection<T>.
[TypeDependencyAttribute("System.SZArrayHelper")]

I'm not familiar with TypeDependencyAttribute, but from the comment, I'm wondering if there is some magic going on that's special for Array. This may explain how IndexOf[T] ends up getting called instead of IndexOf via Array's IList.Contains.

  • Is there any way to tell what type is being assumed for a generic call? In particular, if one were to call `IndexOf` or `Contains` on a `Cat[]` that was cast to `Animal[]`, would the type `T` in those generic method calls be `Cat` or `Animal`? I would guess it would have to be either `Animal` or `Object`, but neither would really fit with `IEquatable`. – supercat Nov 11 '13 at 22:07
  • @supercat - if you call IndexOf() directly, the generic IndexOf gets called for the Cat array, which in turn calls Cat.Equals(Cat) via GenericEqualityComparer. For the array cast to Animal[], Animal.Equals(Animal) gets called via GenericEqualityComparer. However, if you call Contains, Cat.Equals(Object) gets called via ObjectEqualityComparer from IndexOf, regardless of whether you use the Cat array or the array cast to Animal[]. I'm guessing this is because calling IndexOf directly has the type inferred by the compiler, but calling Contains has some low-level magic going on. – hatchet - done with SOverflow Nov 11 '13 at 23:09
  • and my previous comment was using reference types for Cat and Animal. – hatchet - done with SOverflow Nov 11 '13 at 23:15
  • @hatchet: The stack trace shows that some generic `szHelper` method is called on the array, but it doesn't show whether calling `Contains` is calling `szHelper<Cat>` or `szHelper<Animal>`, or whether that method is calling `Array.IndexOf<Cat>` or `Array.IndexOf<Animal>`. – supercat Nov 11 '13 at 23:34
  • @supercat - I was just stepping through that code, and it goes from the `return collection.Contains(value);` in `Enumerable.Contains`, where T is Cat, right into `SzArrayHelper.Contains` where T is object. You should look at the previous question to this one (link at the top of this question). There is more about this there. – hatchet - done with SOverflow Nov 12 '13 at 00:23

I think it's because they are both using their own base implementation of Equals.

Classes inherit Object.Equals, which implements identity equality; structs inherit ValueType.Equals, which implements value equality.
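
The difference between the two inherited implementations can be seen in a small sketch. `NameClass`, `NameStruct`, and `BaseEqualsDemo` are hypothetical names used only for this illustration; neither type overrides Equals, so each falls back to its base implementation.

```csharp
using System;

// Hypothetical types with identical content and no Equals override.
public class NameClass { public string Name; }
public struct NameStruct { public string Name; }

public static class BaseEqualsDemo
{
    // Returns { classEqual, structEqual } for two instances holding the same name.
    public static bool[] Run()
    {
        // Object.Equals: two distinct instances are not equal, even with equal fields.
        bool classEqual = new NameClass { Name = "Fred" }
            .Equals(new NameClass { Name = "Fred" });

        // ValueType.Equals: equal when the types match and all fields are equal.
        bool structEqual = new NameStruct { Name = "Fred" }
            .Equals(new NameStruct { Name = "Fred" });

        return new[] { classEqual, structEqual };
    }
}
```

Here the class comparison is false and the struct comparison is true. This matches the bug mentioned in the question's comments, where forgetting to override the non-generic Equals on a class made Contains fall back to a reference check.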

sa_ddam213

The primary purpose of IEquatable<T> is to allow reasonably efficient equality comparisons with generic structure types. It is intended that IEquatable<T>.Equals((T)x) should behave exactly like Equals((object)(T)x), except that if T is a value type the former will avoid a heap allocation which the latter requires. Although IEquatable<T> does not constrain T to be a struct type, and sealed classes may in some cases receive a slight performance benefit from using it, class types cannot receive nearly as much benefit from that interface as struct types do. A properly written class may perform slightly faster if outside code uses IEquatable<T>.Equals(T) instead of Equals(Object), but should otherwise not care which comparison method is used. Because the performance advantage of using IEquatable<T> with classes is never very large, code which knows it's using a class type might decide that the time required to check whether the type happens to implement IEquatable<T> would likely not be recouped by any performance gain the interface could plausibly offer.

Incidentally, it's worth noting that if X and Y are "normal" classes, X.Equals(Y) may legitimately be true if either X or Y derives from the other. Further, a variable of an unsealed class type may legitimately compare equal to one of any interface type whether or not the class implements that interface. By comparison, a structure can only compare equal to a variable of its own type, Object, ValueType, or an interface which the structure itself implements. The fact that class-type instances may be "equal" to a much wider range of variable types means that IEquatable<T> isn't as applicable to them as it is to structure types.

PS--There's another reason arrays are special: they support a style of covariance which classes cannot. Given

Dog Fido = new Dog();
Cat Felix = new Cat();
Animal[] meows = new Cat[]{Felix};

it is perfectly legal to test meows.Contains(Fido). If meows were replaced with an instance of Animal[] or Dog[], the new array might indeed contain Fido; even if it weren't, one might legitimately have a variable of some unknown type of Animal and want to know if it's contained within meows. Even if Cat implements IEquatable<Cat>, trying to use the IEquatable<Cat>.Equals(Cat) method to test whether an element of meows is equal to Fido would fail because Fido cannot be converted into a Cat. There might be ways for the system to use IEquatable<Cat> when it's workable and Equals(Object) when it isn't, but it would add a lot of complexity, and it would be hard to do without a performance cost which would exceed that of simply using Equals(Object).
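
A minimal sketch of the covariance scenario described above, assuming empty `Animal`, `Cat`, and `Dog` classes that rely on default reference equality (`CovarianceDemo` is a hypothetical wrapper added for illustration):

```csharp
using System;
using System.Linq;

// Assumed hierarchy; the answer does not show these declarations.
public class Animal { }
public class Cat : Animal { }
public class Dog : Animal { }

public static class CovarianceDemo
{
    // Returns { hasFido, hasFelix, fidoIsCat }.
    public static bool[] Run()
    {
        Dog fido = new Dog();
        Cat felix = new Cat();
        Animal[] meows = new Cat[] { felix };   // legal because of array covariance

        // Compiles and runs, although the underlying Cat[] can never hold a Dog.
        bool hasFido = meows.Contains(fido);
        bool hasFelix = meows.Contains(felix);

        // fido cannot be converted to Cat, so an Equals(Cat) overload could not accept it.
        Animal fidoAsAnimal = fido;
        bool fidoIsCat = fidoAsAnimal is Cat;

        return new[] { hasFido, hasFelix, fidoIsCat };
    }
}
```

The lookup for Fido simply returns false, and the lookup for Felix returns true. An equality test typed to Cat would have had no valid way to receive Fido as its argument.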

supercat
  • *a variable of an unsealed class type may legitimately compare equal to one of any interface type whether or not the class implements that interface* If it's applicable to reference types, then why not value types? What's stopping me from implementing such an `Equals` for value types? – nawfal Nov 11 '13 at 19:31
  • @nawfal: If some value type `Fnord` doesn't implement `IFoo`, then any variable of type `IFoo` will either be `null`, or will identify some object that isn't a `Fnord`; since a variable of type `Fnord` will always hold a non-null `Fnord`, `Equals` should return false for anything that isn't a non-null `Fnord`. By contrast, if `Zord` is an unsealed class, even if it doesn't implement `IFoo`, it would be possible that there might exist a class `Yord` which derives from `Zord`, which does implement `IFoo`, and variables of types `Zord` and `IFoo` could both identify instances of `Yord`. – supercat Nov 11 '13 at 19:43
  • I understand, thanks. Very good reasoning. But I still don't think that justifies the `T[]` implementation of `ICollection<T>.Contains(T)`, considering that the client expects the generic `Equals` to be called. Every other generic collection's `Contains` checks for the generic `Equals`, so what makes arrays special here? The only reason I see is *they did not think it's a big deal*. – nawfal Nov 11 '13 at 20:01
  • @nawfal: The client *shouldn't care* which `Equals` gets called, since in a properly written client they should always behave identically if passed the same thing. As for what makes arrays special, they are fundamentally different from generics. Given `Dog Fido = new Dog(); Cat felix = new Cat(); Animal[] meows = new Cat[] {felix};`, what equality-test method should be called for `meows.Contains(Fido)`? It can't be `IEquatable<Cat>.Equals(Cat)`, and I don't think the `Cat[]`'s `Contains` method has any way of knowing that it's been typecast to `Animal[]`. – supercat Nov 11 '13 at 20:14