119

I have some bells in my database with the same number. I want to get all of them without duplication. I created a compare class to do this work, but the execution of the function causes a big delay from the function without distinct, from 0.6 sec to 3.2 sec!

Am I doing it right or do I have to use another method?

reg.AddRange(
    (from a in this.dataContext.reglements
     join b in this.dataContext.Clients on a.Id_client equals b.Id
     where a.date_v <= datefin && a.date_v >= datedeb
     where a.Id_client == b.Id
     orderby a.date_v descending 
     select new Class_reglement
     {
         nom  = b.Nom,
         code = b.code,
         Numf = a.Numf,
     })
    .AsEnumerable()
    .Distinct(new Compare())
    .ToList());

class Compare : IEqualityComparer<Class_reglement>
{
    public bool Equals(Class_reglement x, Class_reglement y)
    {
        if (x.Numf == y.Numf)
        {
            return true;
        }
        else { return false; }
    }
    public int GetHashCode(Class_reglement codeh)
    {
        return 0;
    }
}
CarenRose
  • 1,266
  • 1
  • 12
  • 24
Akrem
  • 5,033
  • 8
  • 37
  • 64
  • 17
    You might want to take a look at [Guidelines and rules for GetHashCode](http://blogs.msdn.com/b/ericlippert/archive/2011/02/28/guidelines-and-rules-for-gethashcode.aspx) – Conrad Frix Jul 14 '11 at 14:15
  • 4
    This blog explains how to use the IEqualityComparer perfectly: http://blog.alex-turok.com/2013/03/c-linq-and-iequalitycomparer.html – Jeremy Ray Brown May 13 '18 at 23:02

7 Answers7

188

Your GetHashCode implementation always returns the same value. Distinct relies on a good hash function to work efficiently because it internally builds a hash table.

When implementing interfaces of classes it is important to read the documentation, to know which contract you’re supposed to implement.1

In your code, the solution is to forward GetHashCode to Class_reglement.Numf.GetHashCode and implement it appropriately there.

Apart from that, your Equals method is full of unnecessary code. It could be rewritten as follows (same semantics, ¼ of the code, more readable):

public bool Equals(Class_reglement x, Class_reglement y)
{
    return x.Numf == y.Numf;
}

Lastly, the ToList call is unnecessary and time-consuming: AddRange accepts any IEnumerable so conversion to a List isn’t required. AsEnumerable is also redundant here since processing the result in AddRange will cause this anyway.


1 Writing code without knowing what it actually does is called cargo cult programming. It’s a surprisingly widespread practice. It fundamentally doesn’t work.

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • 21
    Your Equals fails when either x or y is null. – dzendras Dec 17 '12 at 12:46
  • 4
    @dzendras Same for `GetHashCode`. However, note that the [documentation of `IEqualityComparer`](http://msdn.microsoft.com/en-us/library/ms132151.aspx) doesn’t specify what to do with `null` arguments – but the examples provided in the article don’t handle `null` either. – Konrad Rudolph Dec 17 '12 at 12:55
  • 66
    Wow. *Abomination* is unnecessarily harsh. We are here to help each other, not insult. I guess it makes some people laugh, but I would recommend removing it. – Jess Sep 28 '16 at 19:40
  • 5
    +1 for making me read about "cargo cult programming" on wiki and then changing my skype tag line to "// Deep magic begins here...followed by some heavy wizardry". – Alex Dec 13 '16 at 18:53
  • 3
    @dzendras Using C#6 Null-conditional https://msdn.microsoft.com/en-us/library/dn986595.aspx he could do `return x?.Numf == y?.Numf;` – Luke T O'Brien Feb 08 '17 at 17:05
  • 2
    It is a bit harsh to judge somebody's code an abomination and to assume they don't understand what they are doing - Nobody can know everything. You could of also pointed the the question https://stackoverflow.com/questions/371328/why-is-it-important-to-override-gethashcode-when-equals-method-is-overridden to help explain a bit more – Luke T O'Brien Jun 16 '17 at 13:24
  • 1
    @LukeTO'Brien You’ve made me see the light. As it were. – Konrad Rudolph Jun 16 '17 at 13:46
  • 3
    'When implementing interfaces of classes you need to read their documentation first' and 'Implementing code without knowing what it actually doesn’t work.' - the guy doesn't need a lecture he's asking for help. Not everybody is as good at computer programming as you think you are! – Neil Benn Jul 04 '17 at 16:04
  • 5
    @NeilBenn You mistake frank advice for rudeness. Since OP accepted the answer (and, I might note, in a much sterner version!), they didn’t seem to make the same mistake. I’m not sure why you think that giving advice is rude, but you’re mistaken when you say that “the guy doesn't need a lecture”. I strongly disagree: the lecture *was* needed, and it was taken to heart. The code, as written, was bad, and based on bad work practice. Not pointing this out would be a disservice, and not at all helpful, since then the OP couldn’t improve how they work. – Konrad Rudolph Jul 04 '17 at 16:12
  • @KonradRudolph Strange. It seems to be working now. Before when I opened `documentation` link I got a 404. – erikvimz Oct 20 '17 at 12:55
  • @Fütemire On the contrary. Cargo cult programming is highly relevant in this context. As somebody who used to to this myself, extensively, I remember well that it took me recognising what I was doing (= just writing code without understanding what I was doing, or why) to stop doing it. If there’s another way of concisely conveying this information that’s seen as condescending by fewer people I’m all ears. Until then I prefer to err on the side of actually giving advice. I strongly disagree that this is in conflict with the code of conduct. – Konrad Rudolph Feb 11 '20 at 08:23
  • @KonradRudolph I applaud you for taking advice and making changes to your post to make it a bit less direct. I have reversed my down vote. – Fütemire Feb 12 '20 at 22:10
55

Try This code:

public class GenericCompare<T> : IEqualityComparer<T> where T : class
{
    private Func<T, object> _expr { get; set; }
    public GenericCompare(Func<T, object> expr)
    {
        this._expr = expr;
    }
    public bool Equals(T x, T y)
    {
        var first = _expr.Invoke(x);
        var sec = _expr.Invoke(y);
        if (first != null && first.Equals(sec))
            return true;
        else
            return false;
    }
    public int GetHashCode(T obj)
    {
        return obj.GetHashCode();
    }
}

Example of its use would be

collection = collection
    .Except(ExistedDataEles, new GenericCompare<DataEle>(x=>x.Id))
    .ToList(); 
MoonKnight
  • 23,214
  • 40
  • 145
  • 277
suneelsarraf
  • 923
  • 9
  • 7
  • 22
    `GetHashCode` needs to use the expression as well: `return _expr.Invoke(obj).GetHashCode();` See [this post](http://stackoverflow.com/q/25513372/450913) for a usage of it. – orad Aug 26 '14 at 21:12
  • 1
    shouldn't this fail if collections contains a null? however quick experiment on VS C# Interactive doesn't seem to throw null ref exception! – Social Developer Jan 26 '21 at 12:56
10

If you want a generic solution that creates an IEqualityComparer for your class based on a property (which acts as a key) of that class have a look at this:

public class KeyBasedEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keyGetter;

    public KeyBasedEqualityComparer(Func<T, TKey> keyGetter)
    {
        if (default(T) == null)
        {
            _keyGetter = (x) => x == null ? default : keyGetter(x);
        }
        else
        {
            _keyGetter = keyGetter;
        }
    }

    public bool Equals(T x, T y)
    {
        return EqualityComparer<TKey>.Default.Equals(_keyGetter(x), _keyGetter(y));
    }

    public int GetHashCode(T obj)
    {
        TKey key = _keyGetter(obj);

        return key == null ? 0 : key.GetHashCode();
    }
}

public static class KeyBasedEqualityComparer<T>
{
    public static KeyBasedEqualityComparer<T, TKey> Create<TKey>(Func<T, TKey> keyGetter)
    {
        return new KeyBasedEqualityComparer<T, TKey>(keyGetter);
    }
}

For better performance with structs there isn't any boxing.

Usage is like this:

IEqualityComparer<Class_reglement> equalityComparer =
  KeyBasedEqualityComparer<Class_reglement>.Create(x => x.Numf);
user764754
  • 3,865
  • 2
  • 39
  • 55
  • Can you elaborate on "without boxing" part? What exactly did you do, in comparison to other solutuions here, that avoids boxing? – Kirikan Apr 04 '21 at 09:24
  • @Kirikan It's a generic solution that works with any type (not just `Class_reglement`). And there are no casts to `object` (or any casts) so there is no performance overhead. This is especially relevant if the type is a struct. – user764754 Apr 05 '21 at 10:07
3

Just code, with implementation of GetHashCode and NULL validation:

public class Class_reglementComparer : IEqualityComparer<Class_reglement>
{
    public bool Equals(Class_reglement x, Class_reglement y)
    {
        if (x is null || y is null))
            return false;

        return x.Numf == y.Numf;
    }

    public int GetHashCode(Class_reglement product)
    {
        //Check whether the object is null 
        if (product is null) return 0;

        //Get hash code for the Numf field if it is not null. 
        int hashNumf = product.hashNumf == null ? 0 : product.hashNumf.GetHashCode();

        return hashNumf;
    }
}

Example: list of Class_reglement distinct by Numf

List<Class_reglement> items = items.Distinct(new Class_reglementComparer());
Community
  • 1
  • 1
Shahar Shokrani
  • 7,598
  • 9
  • 48
  • 91
3

The purpose of this answer is to improve on previous answers by:

  • making the lambda expression optional in the constructor so that full object equality can be checked by default, not just on one of the properties.
  • operating on different types of classes, even complex types including sub-objects or nested lists. And not only on simple classes comprising only primitive type properties.
  • Not taking into account possible list container differences.
  • Here, you'll find a first simple code sample that works only on simple types (the ones composed only by primitif properties), and a second one that is complete (for a wider range of classes and complex types).

Here is my 2 pennies try:

public class GenericEqualityComparer<T> : IEqualityComparer<T> where T : class
{
    private Func<T, object> _expr { get; set; }

    public GenericEqualityComparer() => _expr = null;

    public GenericEqualityComparer(Func<T, object> expr) => _expr = expr;

    public bool Equals(T x, T y)
    {
        var first = _expr?.Invoke(x) ?? x;
        var sec = _expr?.Invoke(y) ?? y;

        if (first == null && sec == null)
            return true;

        if (first != null && first.Equals(sec))
            return true;

        var typeProperties = typeof(T).GetProperties();

        foreach (var prop in typeProperties)
        {
            var firstPropVal = prop.GetValue(first, null);
            var secPropVal = prop.GetValue(sec, null);

            if (firstPropVal != null && !firstPropVal.Equals(secPropVal))
                return false;
        }

        return true;
    }

    public int GetHashCode(T obj) =>
        _expr?.Invoke(obj).GetHashCode() ?? obj.GetHashCode();
}

I know we can still optimize it (and maybe use a recursive?).. But that is working like a charm without this much complexity and on a wide range of classes. ;)

Edit: After a day, here is my $10 attempt: First, in a separate static extension class, you'll need:

public static class CollectionExtensions
{
    public static bool HasSameLengthThan<T>(this IEnumerable<T> list, IEnumerable<T> expected)
    {
        if (list.IsNullOrEmptyCollection() && expected.IsNullOrEmptyCollection())
            return true;

        if ((list.IsNullOrEmptyCollection() && !expected.IsNullOrEmptyCollection()) || (!list.IsNullOrEmptyCollection() && expected.IsNullOrEmptyCollection()))
            return false;

        return list.Count() == expected.Count();
    }

    /// <summary>
    /// Used to find out if a collection is empty or if it contains no elements.
    /// </summary>
    /// <typeparam name="T">Type of the collection's items.</typeparam>
    /// <param name="list">Collection of items to test.</param>
    /// <returns><c>true</c> if the collection is <c>null</c> or empty (without items), <c>false</c> otherwise.</returns>
    public static bool IsNullOrEmptyCollection<T>(this IEnumerable<T> list) => list == null || !list.Any();
}

Then, here is the updated class that works on a wider range of classes:

public class GenericComparer<T> : IEqualityComparer<T> where T : class
{
    private Func<T, object> _expr { get; set; }

    public GenericComparer() => _expr = null;

    public GenericComparer(Func<T, object> expr) => _expr = expr;

    public bool Equals(T x, T y)
    {
        var first = _expr?.Invoke(x) ?? x;
        var sec = _expr?.Invoke(y) ?? y;

        if (ObjEquals(first, sec))
            return true;

        var typeProperties = typeof(T).GetProperties();

        foreach (var prop in typeProperties)
        {
            var firstPropVal = prop.GetValue(first, null);
            var secPropVal = prop.GetValue(sec, null);

            if (!ObjEquals(firstPropVal, secPropVal))
            {
                var propType = prop.PropertyType;

                if (IsEnumerableType(propType) && firstPropVal is IEnumerable && !ArrayEquals(firstPropVal, secPropVal))
                    return false;

                if (propType.IsClass)
                {
                    if (!DeepEqualsFromObj(firstPropVal, secPropVal, propType))
                        return false;

                    if (!DeepObjEquals(firstPropVal, secPropVal))
                        return false;
                }
            }
        }

        return true;
    }

    public int GetHashCode(T obj) =>
        _expr?.Invoke(obj).GetHashCode() ?? obj.GetHashCode();

    #region Private Helpers

    private bool DeepObjEquals(object x, object y) =>
        new GenericComparer<object>().Equals(x, y);

    private bool DeepEquals<U>(U x, U y) where U : class =>
        new GenericComparer<U>().Equals(x, y);

    private bool DeepEqualsFromObj(object x, object y, Type type)
    {
        dynamic a = Convert.ChangeType(x, type);
        dynamic b = Convert.ChangeType(y, type);
        return DeepEquals(a, b);
    }

    private bool IsEnumerableType(Type type) =>
        type.GetInterface(nameof(IEnumerable)) != null;

    private bool ObjEquals(object x, object y)
    {
        if (x == null && y == null) return true;
        return x != null && x.Equals(y);
    }

    private bool ArrayEquals(object x, object y)
    {
        var firstList = new List<object>((IEnumerable<object>)x);
        var secList = new List<object>((IEnumerable<object>)y);

        if (!firstList.HasSameLengthThan(secList))
            return false;

        var elementType = firstList?.FirstOrDefault()?.GetType();
        int cpt = 0;
        foreach (var e in firstList)
        {
            if (!DeepEqualsFromObj(e, secList[cpt++], elementType))
                return false;
        }

        return true;
    }

    #endregion Private Helpers

We can still optimize it but it worth give it a try ^^.

Gert Arnold
  • 105,341
  • 31
  • 202
  • 291
  • I've just edited with a `GenericComparer` class that works also on types with other objects or lists as nested members. – Islem BEZZARGA Feb 09 '22 at 17:46
  • @GertArnold, a part from saying what I already said, that this should work on a wider range of classes including object with nested objects or array inside, I really don't know what to add... It does not cook for you if that was your only concerned. :) – Islem BEZZARGA Feb 09 '22 at 19:19
  • @GertArnold: I've just edited the top of this answer for you ;) with an _**Edit 2**_ part. I really thought that this was kind of obvious from my code snippet. Anyway, I'll get this good habit now, thank you. – Islem BEZZARGA Feb 09 '22 at 23:42
2

The inclusion of your comparison class (or more specifically the AsEnumerable call you needed to use to get it to work) meant that the sorting logic went from being based on the database server to being on the database client (your application). This meant that your client now needs to retrieve and then process a larger number of records, which will always be less efficient that performing the lookup on the database where the approprate indexes can be used.

You should try to develop a where clause that satisfies your requirements instead, see Using an IEqualityComparer with a LINQ to Entities Except clause for more details.

Community
  • 1
  • 1
Justin
  • 84,773
  • 49
  • 224
  • 367
1

IEquatable<T> can be a much easier way to do this with modern frameworks.

You get a nice simple bool Equals(T other) function and there's no messing around with casting or creating a separate class.

public class Person : IEquatable<Person>
{
    public Person(string name, string hometown)
    {
        this.Name = name;
        this.Hometown = hometown;
    }

    public string Name { get; set; }
    public string Hometown { get; set; }

    // can't get much simpler than this!
    public bool Equals(Person other)
    {
        return this.Name == other.Name && this.Hometown == other.Hometown;
    }

    public override int GetHashCode()
    {
        return Name.GetHashCode();  // see other links for hashcode guidance 
    }
}

Note you DO have to implement GetHashCode if using this in a dictionary or with something like Distinct.

PS. I don't think any custom Equals methods work with entity framework directly on the database side (I think you know this because you do AsEnumerable) but this is a much simpler method to do a simple Equals for the general case.

If things don't seem to be working (such as duplicate key errors when doing ToDictionary) put a breakpoint inside Equals to make sure it's being hit and make sure you have GetHashCode defined (with override keyword).

Simon_Weaver
  • 140,023
  • 84
  • 646
  • 689