I have a list whose items have the following properties:

public class Category 
{    
   public int RecordId { get; set;}
   public string Category { get; set;}
   public Data DataObj { get; set;}
}

My list is defined as List<Category> categories.

The Data objects hold values like the following:

{
   Id: 1,
   Name: "Smith",
   Input: "7,8",
   Output: "Output1",
   CreatedBy: "swallac",
   CreatedON: "12/01/2018"
},
{
   Id: 3,
   Name: "Austin",
   Input: "9,10",
   Output: "Output1",
   CreatedBy: "amanda",
   CreatedON: "12/03/2018"
},
{
   Id: 2,
   Name: "Austin",
   Input: "9,10",
   Output: "Output1",
   CreatedBy: "amanda",
   CreatedON: "12/03/2018"
}

How can I get the duplicate item in the Data object?

I have tried the following, but it does not seem to return the correct results.

 var categoriesFiltered = categories.Select(g => g.DataObj);

 var duplicateDataa = categoriesFiltered.GroupBy(x => x)
                                     .Where(g => g.Count() > 1)
                                     .Select(y => y.Key);
Chatwa
  • [Distinct](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.distinct?view=net-5.0) is your friend. Implement `IEquatable` in your `Category` class. – Sani Huttunen Apr 29 '21 at 08:38
  • @SaniSinghHuttunen `Distinct()` would not find the duplicate, OP question is "_How can I get the duplicate item in the Data object?_" – Cleptus Apr 29 '21 at 08:38
  • @Cleptus: Then there are two questions. Headline says `Remove duplicates in a list in C#`. – Sani Huttunen Apr 29 '21 at 08:39
  • Indeed, the question is a bit unclear – Cleptus Apr 29 '21 at 08:40
  • @Cleptus I want to get the duplicate items first and then remove them. – Chatwa Apr 29 '21 at 08:40
  • There are no duplicate items in your question. The Ids are different. What are the exact criteria for detecting the duplicate items? – Bizhan Apr 29 '21 at 08:41
  • What constitutes a duplicate? All properties being equal other than `Id`? – Johnathan Barclay Apr 29 '21 at 08:42
  • @Bizhan: The properties Name, Input and Output – Chatwa Apr 29 '21 at 08:43
  • @JohnathanBarclay Yes you got it right. – Chatwa Apr 29 '21 at 08:43
  • Does this answer your question? [Group By Multiple Columns](https://stackoverflow.com/questions/847066/group-by-multiple-columns) – Cleptus Apr 29 '21 at 08:44
  • OK, then your problem boils down to grouping by those common properties. I have flagged your question as a duplicate; check the link and the answer given in that related question. It should help you. – Cleptus Apr 29 '21 at 08:48
  • @Cleptus How to remove the duplicate items? – Chatwa Apr 29 '21 at 08:50
  • What if the common properties are generic? How do I do the filtering? – Chatwa Apr 29 '21 at 08:51
  • Two ways: a) You could group by the common properties and retrieve them together with the MIN(Id), then remove all items that have those common properties and whose Id is not the Min(Id). b) Follow Sani Singh's comment [or Bizhan's answer](https://stackoverflow.com/a/67314090/2265446) – Cleptus Apr 29 '21 at 08:54
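
Option a) from the last comment could be sketched roughly as below. This is only an illustration under some assumptions: `DuplicateFinder`, `FindDuplicates` and `RemoveDuplicates` are made-up names, a duplicate is taken to mean "same Name, Input and Output", the item with the lowest `Data.Id` in each group is the one kept, and the string property of `Category` is renamed to `CategoryName` because C# does not allow a member to have the same name as its enclosing type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Input { get; set; } = "";
    public string Output { get; set; } = "";
}

public class Category
{
    public int RecordId { get; set; }
    // Renamed for this sketch: a member cannot share the name of its enclosing type.
    public string CategoryName { get; set; } = "";
    public Data DataObj { get; set; } = new Data();
}

// Hypothetical helper, not part of the original question.
public static class DuplicateFinder
{
    // Returns every item that shares Name, Input and Output with an item
    // that has a lower Data.Id (the lowest Id in each group is kept).
    public static List<Category> FindDuplicates(IEnumerable<Category> categories)
    {
        return categories
            // Anonymous types compare by value, so they work as a composite key.
            .GroupBy(c => new { c.DataObj.Name, c.DataObj.Input, c.DataObj.Output })
            .Where(g => g.Count() > 1)
            .SelectMany(g => g.OrderBy(c => c.DataObj.Id).Skip(1))
            .ToList();
    }

    // Removes those duplicates from the list in place and returns how many were removed.
    public static int RemoveDuplicates(List<Category> categories)
    {
        // Category does not override Equals, so the set matches by reference.
        var duplicates = new HashSet<Category>(FindDuplicates(categories));
        return categories.RemoveAll(duplicates.Contains);
    }
}
```

Calling `DuplicateFinder.FindDuplicates(categories)` first and `DuplicateFinder.RemoveDuplicates(categories)` afterwards matches the "get the duplicates, then remove them" order asked for in the comments.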

2 Answers

You can use an Equality Comparer:

    class MyComparer : IEqualityComparer<Category>
    {
        public bool Equals(Category? x, Category? y)
        {
            return x?.DataObj.Name == y?.DataObj.Name &&
                   x?.DataObj.Input == y?.DataObj.Input &&
                   x?.DataObj.Output == y?.DataObj.Output;
        }

        public int GetHashCode(Category obj)
        {
            return obj.DataObj.Name.GetHashCode() +
                   obj.DataObj.Input.GetHashCode() +
                   obj.DataObj.Output.GetHashCode();
        }
    }

Then pass that comparer to Distinct on the Category list (MyComparer compares Category items, not Data items) to drop the duplicates:

 var distinctList = categories.Distinct(new MyComparer());

Alternatively you can implement IEquatable<T> on any class that needs to be compared in an arbitrary way:

    public class Category : IEquatable<Category>
    {
        public int RecordId { get; set;}
        public string Category { get; set;}
        public Data DataObj { get; set;}

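        public bool Equals(Category? other)
        {
            return DataObj.Equals(other?.DataObj);
        }

        // Distinct() hashes items before calling Equals, so equal items must return equal hash codes.
        public override int GetHashCode() => DataObj.GetHashCode();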
    }
    public class Data : IEquatable<Data>
    {
        ...
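        public bool Equals(Data? other)
        {
            return Name.Equals(other?.Name) && Input.Equals(other?.Input) && ...;
        }

        // Override GetHashCode() as well, combining the same properties that Equals compares;
        // otherwise Distinct() hashes each object individually and will usually miss the duplicates.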
    }

Then let C# use them:

 var distinctList = categoriesFiltered.Distinct();

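If you want to know which items were removed as duplicates, you can use Except on the same list that Distinct was called on. Except uses the default equality unless you pass it a comparer, so this only returns the removed items while the elements are compared by reference, as in the MyComparer approach. Once the class implements IEquatable<T>, the duplicates compare equal to the kept items and the result is empty.

 var duplicates = categories.Except(distinctList);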
Bizhan
  • What if `Category` is of type `T`? How can I do that with varying properties knowing that I will have only one property in common, `Id`? – Chatwa Apr 29 '21 at 09:01
  • I want to exclude the `Id` property when doing the comparison though. Please help. – Chatwa Apr 29 '21 at 09:17
  • @Chatwa the original question was actually two different questions with little information provided, and the complete answer to them can be found here. But it's difficult to understand your question if you can't describe it properly. So what you said earlier about comparing Name, Input and Output of two `Category` objects was actually not true, and you want to use reflection to search any two objects for all their properties and then what... exclude the one with the name equal to "Id"? Is this your actual question? I wouldn't go down the reflection path as it's usually too slow. – Bizhan Apr 29 '21 at 09:42
  • @Chatwa If you need to compare any two objects with arbitrary comparison logic, it's best to just implement IEquatable on all your classes and implement your custom comparison logic there, as JonasH mentioned. – Bizhan Apr 29 '21 at 09:45
  • I am a bit confused. Could you please help with the implementation? – Chatwa Apr 29 '21 at 09:50
  • 1
    Maybe something similar like this: https://stackoverflow.com/a/23623976/15760192? Just to avoid the reflection – Chatwa Apr 29 '21 at 09:57

You are probably lacking an equality comparer.

Since Category is a class without an Equals method, it defaults to reference equality. There are several ways to define equality for objects:

  1. Create a new class implementing IEqualityComparer<Category> and use it as input to your Distinct call. This allows several different ways of comparing the same type.
  2. Let Category implement IEquatable<Category>.
  3. Override the Equals(object) method (together with GetHashCode).
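
As a rough illustration of options 2 and 3 combined, here is a minimal sketch for the `Data` class. It assumes that equality means "same Name, Input and Output, ignoring Id", as stated in the comments on the question, and that `System.HashCode` is available (.NET Core 2.1 or later); the property list is taken from the sample data in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Data : IEquatable<Data>
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Input { get; set; } = "";
    public string Output { get; set; } = "";

    // Option 2: strongly typed equality. Id is deliberately left out of the comparison.
    public bool Equals(Data? other)
    {
        return other != null &&
               Name == other.Name &&
               Input == other.Input &&
               Output == other.Output;
    }

    // Option 3: route the untyped overload through the typed one.
    public override bool Equals(object? obj) => Equals(obj as Data);

    // Must combine exactly the properties used in Equals, otherwise
    // hash-based operations such as Distinct() and GroupBy() will not see the items as equal.
    public override int GetHashCode() => HashCode.Combine(Name, Input, Output);
}
```

With this in place, the `GroupBy(x => x)` attempt from the question and a plain `Distinct()` on the `Data` sequence would both treat the two "Austin" entries as equal.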
JonasH
  • How can I implement it in a generic way? I want to exclude the `Id` property though. – Chatwa Apr 29 '21 at 09:16
  • You can make a `KeyEqualityComparer` that takes a `Func<T, TKey>` that selects a property to compare. In this case you might be able to create one that takes a `Func` and compares all the strings for equality. But there is a risk that the generic solution will end up much more complicated than just a regular implementation, so you might want to just go with the example Bizhan provided. – JonasH Apr 29 '21 at 09:39
  • I want to compare all properties except the `Id` one. – Chatwa Apr 29 '21 at 09:44
  • Maybe something similar like this? – Chatwa Apr 29 '21 at 09:57
  • @Chatwa There is no magic "compare everything except the Id" function. You could perhaps write something like that using reflection, but it would be horribly complicated and probably also fragile. Assuming you are referring to [this](https://stackoverflow.com/questions/6694508/how-to-use-the-iequalitycomparer/23623976#23623976) it is the same concept as "`KeyEqualityComparer`", but it will not work for your case since you have several properties to compare. – JonasH Apr 29 '21 at 11:24
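
For reference, the `KeyEqualityComparer` idea from the comments above might look roughly like this. It is a sketch, not an existing framework class: the type name comes from the comment, while the constructor shape, the `KeyComparerDemo` helper and the use of a tuple as the key are assumptions made for illustration. Selecting a tuple is one way to cover several properties with a single key selector, since value tuples compare member by member.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical generic comparer: two items are equal when their selected keys are equal.
public sealed class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer;

    public KeyEqualityComparer(Func<T, TKey> keySelector, IEqualityComparer<TKey>? keyComparer = null)
    {
        _keySelector = keySelector ?? throw new ArgumentNullException(nameof(keySelector));
        _keyComparer = keyComparer ?? EqualityComparer<TKey>.Default;
    }

    public bool Equals(T x, T y)
    {
        // Two nulls are equal; a null never equals a non-null item.
        if (x == null || y == null) return x == null && y == null;
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        // Hash the key, not the item, so that equal keys land in the same bucket.
        return obj == null ? 0 : _keyComparer.GetHashCode(_keySelector(obj)!);
    }
}

public class Data
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Input { get; set; } = "";
    public string Output { get; set; } = "";
}

public static class KeyComparerDemo
{
    // Keeps one item per (Name, Input, Output) combination; Id is not part of the key.
    public static List<Data> DistinctIgnoringId(IEnumerable<Data> items)
    {
        var comparer = new KeyEqualityComparer<Data, (string, string, string)>(
            d => (d.Name, d.Input, d.Output));
        return items.Distinct(comparer).ToList();
    }
}
```

The caller still has to list the properties that matter for each type, so this does not give a "compare everything except Id" shortcut; it only moves that list from a hand-written comparer class into a lambda.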