I have a List<CustomObject> and want to remove duplicates from it.
If two custom objects have the same value for the City property, I consider them duplicates.
I have implemented IEqualityComparer as follows, but I am not able to remove duplicates from the list.

What is missing?

 public class CustomAddress : IAddress, IEqualityComparer<IAddress>
 {
    //Other class members go here

    //IEqualityComparer members
    public bool Equals(IAddress x, IAddress y)
    {
        // Check whether the compared objects reference the same data.
        if (ReferenceEquals(x, y)) return true;

        // Check whether any of the compared objects is null.
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
            return false;

        // Check whether the Objects' properties are equal.
        return x.City.Equals(y.City);

    }

    public int GetHashCode(IAddress obj)
    {
        // Check whether the object is null.
        if (ReferenceEquals(obj, null)) return 0;

        int hashAreaName = obj.City == null ? 0 : obj.City.GetHashCode();
        return hashAreaName;
    }
 }

I am using .NET 3.5

inutan

4 Answers

With your overrides of Equals and GetHashCode in place, if you have an existing list that you need to filter, simply invoke Distinct() (available through the namespace System.Linq) on the list.

var noDupes = list.Distinct();

This will give you a duplicate-free sequence. If you need that to be a concrete list, simply add a ToList() to the end of the invocation.

var noDupes = list.Distinct().ToList();
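
As a minimal sketch of what "overrides in place" could look like, assuming a hypothetical CustomObject whose only relevant member is a string City property (the class shape and the DistinctDemo helper are illustrative and not taken from the question):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class for illustration; only City takes part in equality.
public class CustomObject : IEquatable<CustomObject>
{
    public string City { get; set; }

    // Strongly typed equality, picked up by EqualityComparer<CustomObject>.Default.
    public bool Equals(CustomObject other)
    {
        if (ReferenceEquals(other, null)) return false;
        return string.Equals(City, other.City);
    }

    // Keep the object-based overload consistent with the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as CustomObject);
    }

    // Must agree with Equals: objects with equal cities return equal hash codes.
    public override int GetHashCode()
    {
        return City == null ? 0 : City.GetHashCode();
    }
}

public static class DistinctDemo
{
    // Hypothetical helper: returns a new list with one entry per city.
    public static List<CustomObject> RemoveDuplicates(List<CustomObject> list)
    {
        // Distinct() with no arguments falls back to the default comparer,
        // which calls the Equals and GetHashCode members defined above.
        return list.Distinct().ToList();
    }
}
```

With these members defined on the class itself, no separate comparer instance has to be passed to Distinct.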

Another answer mentions implementing an IEqualityComparer<CustomObject>. This is useful when overriding Equals and GetHashCode directly is either impossible (you don't control the source) or does not make sense (your idea of equality in this particular case is not universal for the class). In that case, define the comparer as demonstrated and provide an instance of the comparer to an overload of Distinct.

Finally, if you're building a list from the ground up and want to avoid duplicates being inserted, you can use a HashSet<T> as mentioned here. The HashSet constructor also accepts a custom comparer, so you can optionally supply one.

var mySet = new HashSet<CustomObject>();
bool isAdded = mySet.Add(myElement); 
// isAdded will be false if myElement already exists in set, and 
// myElement would not be added a second time.
// or you could use 
if (!mySet.Contains(myElement))
     mySet.Add(myElement);

One more option that does not use .NET library methods but can be useful in a pinch is Jon Skeet's DistinctBy, a rough implementation of which you can see here. The idea is that you pass a Func<MyObject, Key> lambda expression directly and omit the overrides of Equals and GetHashCode (or the custom comparer) entirely.

 var noDupes = list.DistinctBy(obj => obj.City); // NOT part of BCL
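
For readers without the link, the idea can be sketched roughly as follows. This is not Jon Skeet's actual code, only a minimal illustration of the same technique; the class name EnumerableSketch and the private iterator helper are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableSketch
{
    // Yields the first element seen for each distinct key.
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        // Validate eagerly so bad arguments fail at the call site,
        // not when the sequence is first enumerated.
        if (source == null) throw new ArgumentNullException("source");
        if (keySelector == null) throw new ArgumentNullException("keySelector");
        return DistinctByIterator(source, keySelector);
    }

    private static IEnumerable<TSource> DistinctByIterator<TSource, TKey>(
        IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        // Keys are compared with the default equality comparer for TKey.
        var seenKeys = new HashSet<TKey>();
        foreach (TSource element in source)
        {
            // HashSet<T>.Add returns false when the key was already present.
            if (seenKeys.Add(keySelector(element)))
                yield return element;
        }
    }
}
```

Because equality is decided on the selected key only, the element type needs no Equals or GetHashCode overrides. Note that .NET 6 and later ship their own Enumerable.DistinctBy, so a helper like this is mainly of interest on older frameworks such as the 3.5 version mentioned in the question.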
Anthony Pegram

To match duplicates on only a specific property you need a comparer.

class MyComparer : IEqualityComparer<CustomObject>
{
    public bool Equals(CustomObject x, CustomObject y)
    {
        return x.City.Equals(y.City);
    }

    public int GetHashCode(CustomObject x)
    {
        return x.City.GetHashCode();
    }
}

Usage:

var yourDistinctObjects = yourObjects.Distinct(new MyComparer());

Edit: Found this thread that does what you need and I think I referred to it in the past:

Remove duplicates in the list using linq

One answer that I thought was kind of interesting (but not how I had done it) was:

var distinctItems = items.GroupBy(x => x.Id).Select(y => y.First());

It's a one-liner that does what you need, but it might not be as efficient as the other methods.

Kelsey
  • This is slightly misleading: the user doesn't *need* a comparer, since he/she has overridden Equals and GetHashCode. But that doesn't make the answer invalid. When would you favor the external comparer versus overriding Equals and GetHashCode as the user has done? – Anthony Pegram Oct 19 '11 at 14:09
  • @Anthony Pegram good question :) I looked up how I had done it in another project and cut and pasted it... now you have me thinking about why I didn't do it that way too lol. It works though. – Kelsey Oct 19 '11 at 14:13
  • Thanks for your comments. I am wondering why it does not work if my CustomAddress class implements IEqualityComparer. Why is it required to declare a separate comparer class? – inutan Oct 19 '11 at 14:36
  • @iniki, You don't want to implement `IEqualityComparer` inside the class that is being compared. As for why you have to declare a separate instance, `Distinct` takes the comparer you provide and uses it to get item hashcodes and then to test for equality between items as necessary. If you do not provide such an instance as an argument, it will use the default comparer, i.e., check the `GetHashCode` and `Equals` methods as defined on the object. – Anthony Pegram Oct 19 '11 at 14:54
  • @iniki, Note: by changing your override of `Equals` to an implementation of `IEqualityComparer` instead, you're no longer providing a valid `Equals` implementation in the object itself, and now `Distinct` will be looking at the default equality comparison for class objects, which will be by reference. If you revert the code to the original snippet, you can utilize `Distinct()` directly without needing a custom comparer implementation. – Anthony Pegram Oct 19 '11 at 14:54
  • @Anthony Pegram ah, this is starting to remind me... I created this comparer in my project for a very specific compare I wanted to do and that is why I went this route. I think what you said is 100% correct and in his case I don't think he should override the `Equals` in his class. – Kelsey Oct 19 '11 at 16:08
Just by implementing .Equals the way you did (which you implemented correctly) you will not prevent duplicates from being added to a List<T>. You will actually have to manually remove them.

Instead of List<CustomObject> use HashSet<CustomObject>. It will never contain duplicates.

bitbonk

That's because List<CustomObject> tests whether your class (CustomObject) implements IEquatable<CustomObject>, not IEquatable<IAddress> as you did.

I assume that for the duplicate check you are using the Contains method before adding a new member.
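
A minimal sketch of that difference, assuming the duplicate check really is List<T>.Contains. The types LooseAddress and StrictAddress, the ContainsDemo helper, and this cut-down IAddress interface are hypothetical names made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, reduced version of the interface from the question.
public interface IAddress
{
    string City { get; }
}

// Equatable against the interface only. List<LooseAddress> looks for
// IEquatable<LooseAddress>, does not find it, and falls back to Object.Equals.
public class LooseAddress : IAddress, IEquatable<IAddress>
{
    public string City { get; set; }

    public bool Equals(IAddress other)
    {
        return other != null && string.Equals(City, other.City);
    }
}

// Equatable against its own type, which is what List<StrictAddress> looks for.
public class StrictAddress : IAddress, IEquatable<StrictAddress>
{
    public string City { get; set; }

    public bool Equals(StrictAddress other)
    {
        return !ReferenceEquals(other, null) && string.Equals(City, other.City);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as StrictAddress);
    }

    public override int GetHashCode()
    {
        return City == null ? 0 : City.GetHashCode();
    }
}

public static class ContainsDemo
{
    // Reference equality is used here, so a second instance is not found.
    public static bool LooseListContainsSameCity()
    {
        var list = new List<LooseAddress> { new LooseAddress { City = "Oslo" } };
        return list.Contains(new LooseAddress { City = "Oslo" });
    }

    // IEquatable<StrictAddress>.Equals is used here, so the city match is found.
    public static bool StrictListContainsSameCity()
    {
        var list = new List<StrictAddress> { new StrictAddress { City = "Oslo" } };
        return list.Contains(new StrictAddress { City = "Oslo" });
    }
}
```

The first helper returns false and the second returns true, because the default equality comparer only uses IEquatable<T> when T is the list's own element type.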

Panos Theof