2

Similar to "Remove duplicate items from list in C#".

I want to build a list in which an item that appears more than once is treated as a single item: it should not be duplicated in the list, but it should not be dropped either.

Using the example from the question linked above: https://dotnetfiddle.net/NPqzne

List<MyClass> list = new List<MyClass>();

list.Add(new MyClass() { BillId = "123", classObj = {} });
list.Add(new MyClass() { BillId = "777", classObj = {} });
list.Add(new MyClass() { BillId = "999", classObj = {} });
list.Add(new MyClass() { BillId = "123", classObj = {} });

var result = list.GroupBy(x => x.BillId)
    .Where(x => x.Count() == 1)
    .Select(x => x.First());

Console.WriteLine(string.Join(", ", result.Select(x => x.BillId)));

How would I change that so that the result is

123, 777, 999

rather than ignoring 123 altogether because it's a duplicate?

Vivek Nuna
Elsie
  • I don't see how "not duplicating it in the list" and "not ignoring it either" can be combined. How do you picture "not ignoring it" while it's not added to the list? – Gert Arnold Dec 23 '22 at 22:08

5 Answers

1

You can change these lines in your code. I tried it with your dotnetfiddle code and it works as expected.

var result = list.Select(x => x.BillId).Distinct();
Console.WriteLine(string.Join(", ", result));

You need to use Distinct to get the unique values. Note that result here is a sequence of BillId strings, not of MyClass objects.
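
The difference between calling Distinct on the objects and on the projected keys can be sketched as follows. This is a minimal illustration, assuming a stand-in MyClass that has only a BillId property and does not override Equals; the names DistinctDemo, Sample and DistinctBillIds are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's class; only BillId matters here.
public class MyClass
{
    public string BillId { get; set; }
}

public static class DistinctDemo
{
    // Same four items as in the question: "123" appears twice.
    public static List<MyClass> Sample()
    {
        return new List<MyClass>
        {
            new MyClass { BillId = "123" },
            new MyClass { BillId = "777" },
            new MyClass { BillId = "999" },
            new MyClass { BillId = "123" },
        };
    }

    // Projects to the key first, then removes duplicate strings.
    public static List<string> DistinctBillIds(IEnumerable<MyClass> items)
    {
        return items.Select(x => x.BillId).Distinct().ToList();
    }

    public static void Main()
    {
        var list = Sample();

        // Without an Equals override, Distinct compares references,
        // so the two "123" objects both survive.
        Console.WriteLine(list.Distinct().Count()); // 4

        // Strings compare by value, so the duplicate key is removed.
        Console.WriteLine(string.Join(", ", DistinctBillIds(list))); // 123, 777, 999
    }
}
```

If you need the MyClass objects rather than just the keys, deduplicate by key instead of projecting the key away.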

Thank you for providing the dotnetfiddle link; it made writing the code easy.

Vivek Nuna
0

Starting with .NET 6, you can use DistinctBy:

var result = list
  .DistinctBy(x => x.BillId)
  .ToList();

On older versions you can modify your current GroupBy solution. Drop the .Where(x => x.Count() == 1) filter: it discards every group with Count() > 1, which is exactly the set of duplicates you want to keep one copy of:

var result = list
  .GroupBy(x => x.BillId)
  .Select(x => x.First())
  .ToList();

Finally, a solution without LINQ, using a HashSet<string>:

var result = new List<MyClass>();

var unique = new HashSet<string>();

// HashSet.Add returns false when the BillId has already been seen
foreach (var item in list)
  if (unique.Add(item.BillId))
    result.Add(item);
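
The HashSet loop above can also be wrapped in a reusable extension method, which is roughly what DistinctBy gives you on .NET 6. The following is only a sketch for older frameworks: KeepFirstBy is a hypothetical name chosen to avoid clashing with the built-in method, and MyClass is reduced to a BillId property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's class.
public class MyClass
{
    public string BillId { get; set; }
}

public static class DedupExtensions
{
    // Hypothetical helper: yields the first item seen for each key, in source order.
    public static IEnumerable<T> KeepFirstBy<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (var item in source)
        {
            // Add returns false when the key is already in the set.
            if (seen.Add(keySelector(item)))
                yield return item;
        }
    }
}

public static class KeepFirstDemo
{
    public static List<MyClass> Sample()
    {
        return new List<MyClass>
        {
            new MyClass { BillId = "123" },
            new MyClass { BillId = "777" },
            new MyClass { BillId = "999" },
            new MyClass { BillId = "123" },
        };
    }

    public static void Main()
    {
        var result = Sample().KeepFirstBy(x => x.BillId).ToList();
        Console.WriteLine(string.Join(", ", result.Select(x => x.BillId))); // 123, 777, 999
    }
}
```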
Dmitry Bychenko
0

You could use a Dictionary or HashSet instead of a List. A Dictionary does not allow duplicate keys, and a HashSet does not allow duplicate elements:

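Dictionary<string, MyClass> dict = new Dictionary<string, MyClass>();
// The keys must be strings to match Dictionary<string, MyClass>.
// TryAdd (.NET Core 2.0+ / .NET Standard 2.1+) returns false for an existing key;
// plain Add would throw an ArgumentException on the second "123".
dict.TryAdd("123", new MyClass() { BillId = "123", classObj = {} });
dict.TryAdd("777", new MyClass() { BillId = "777", classObj = {} });
dict.TryAdd("999", new MyClass() { BillId = "999", classObj = {} });
dict.TryAdd("123", new MyClass() { BillId = "123", classObj = {} });  // returns false: the key is already present, so nothing is added

var result = dict.Select(x => x.Value);
// Typically prints 123, 777, 999 (Dictionary enumeration order is not formally guaranteed)
Console.WriteLine(string.Join(", ", result.Select(x => x.BillId)));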
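
With a Dictionary you also have to decide which of two items with the same key wins. The sketch below contrasts "keep the first" with "keep the last"; it assumes a hypothetical Note property on MyClass purely to tell the two "123" items apart, and it uses ContainsKey so that it also works on frameworks without TryAdd.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's class; Note exists only for this illustration.
public class MyClass
{
    public string BillId { get; set; }
    public string Note { get; set; }
}

public static class DictionaryDemo
{
    // Keeps the first item seen for each BillId.
    public static Dictionary<string, MyClass> KeepFirst(IEnumerable<MyClass> items)
    {
        var dict = new Dictionary<string, MyClass>();
        foreach (var item in items)
        {
            if (!dict.ContainsKey(item.BillId))
                dict.Add(item.BillId, item);
        }
        return dict;
    }

    // Keeps the last item seen for each BillId: the indexer overwrites silently.
    public static Dictionary<string, MyClass> KeepLast(IEnumerable<MyClass> items)
    {
        var dict = new Dictionary<string, MyClass>();
        foreach (var item in items)
            dict[item.BillId] = item;
        return dict;
    }

    public static MyClass[] Sample()
    {
        return new[]
        {
            new MyClass { BillId = "123", Note = "first" },
            new MyClass { BillId = "777", Note = "only" },
            new MyClass { BillId = "123", Note = "second" },
        };
    }

    public static void Main()
    {
        Console.WriteLine(KeepFirst(Sample())["123"].Note); // first
        Console.WriteLine(KeepLast(Sample())["123"].Note);  // second
        Console.WriteLine(KeepFirst(Sample()).Count);       // 2
    }
}
```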
Dmitry Bychenko
Abbas
0

There is an easy and standard way of preventing duplicates from being added to the list: use a HashSet and a custom IEqualityComparer.

The equality comparer should treat MyClass objects with the same BillId as equal. Using that specification, this comparer is generated (by ReSharper):

sealed class BillEqualityComparer : IEqualityComparer<MyClass>
{
    public bool Equals(MyClass x, MyClass y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null)) return false;
        if (ReferenceEquals(y, null)) return false;
        if (x.GetType() != y.GetType()) return false;
        return x.BillId == y.BillId;
    }

    public int GetHashCode(MyClass obj)
    {
        return obj.BillId.GetHashCode();
    }
}

Now the code only needs a slight modification:

HashSet<MyClass> hashSet = new HashSet<MyClass>(new BillEqualityComparer());

hashSet.Add(new MyClass() { BillId = "123", classObj = { } });
hashSet.Add(new MyClass() { BillId = "777", classObj = { } });
hashSet.Add(new MyClass() { BillId = "999", classObj = { } });
hashSet.Add(new MyClass() { BillId = "123", classObj = { } });

And you'll see that the last object isn't added (the return value of the Add method is false).

I don't really see what "not duplicating it in the list and not ignoring it either" means in your view. You could check the return value of hashSet.Add and, when it is false, do something with the rejected item.
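
Acting on the return value of Add might look like the sketch below. It is only an illustration: BillIdComparer is a condensed variant of the comparer above, CollectRejected is a hypothetical helper, and MyClass is reduced to a BillId property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's class.
public class MyClass
{
    public string BillId { get; set; }
}

// Condensed variant of the comparer shown above: equality by BillId only.
public sealed class BillIdComparer : IEqualityComparer<MyClass>
{
    public bool Equals(MyClass x, MyClass y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.BillId == y.BillId;
    }

    public int GetHashCode(MyClass obj)
    {
        return obj.BillId == null ? 0 : obj.BillId.GetHashCode();
    }
}

public static class HashSetDemo
{
    // Adds every item to the set and returns the ones the set refused.
    public static List<MyClass> CollectRejected(IEnumerable<MyClass> items, HashSet<MyClass> set)
    {
        var rejected = new List<MyClass>();
        foreach (var item in items)
        {
            if (!set.Add(item))       // false: an equal item is already in the set
                rejected.Add(item);   // "do something" with the duplicate
        }
        return rejected;
    }

    public static MyClass[] Sample()
    {
        return new[]
        {
            new MyClass { BillId = "123" },
            new MyClass { BillId = "777" },
            new MyClass { BillId = "999" },
            new MyClass { BillId = "123" },
        };
    }

    public static void Main()
    {
        var set = new HashSet<MyClass>(new BillIdComparer());
        var rejected = CollectRejected(Sample(), set);

        Console.WriteLine(set.Count);                                          // 3
        Console.WriteLine(string.Join(", ", rejected.Select(x => x.BillId))); // 123
    }
}
```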

Gert Arnold
0

To deduplicate a collection of instances of an arbitrary class, you first need to define what it means for two instances to be equal: that is, to duplicate one another. That's easy for simple data types like integers. It's a little harder for strings, because case-insensitivity is part of the picture.

For your arbitrary class, make it implement the IEquatable<T> interface and override GetHashCode so that it is consistent with Equals. Once you have done that, you can make a HashSet of your instances. Inserting instances into that HashSet will drop the dupes.

Add this to your class definition to declare that it implements IEquatable.

public class MyClass : IEquatable<MyClass> {

Then implement Equals(MyClass) in your class, and override Equals(object) and GetHashCode to match. An example is here. Visual Studio has helpful features to assist you in implementing all the methods you need.

If you need to be able to sort your instances, you can implement IComparable as well; then sort operations will work.

It's hard to give more specific advice about implementing those interfaces because you didn't describe MyClass in your question.
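
As a rough sketch only, and assuming that two MyClass instances should count as equal exactly when their BillId strings match (the question does not say so explicitly), the implementation could look like this. MyClass is reduced to a BillId property, and EquatableDemo and Sample are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: equality is defined by BillId alone.
public class MyClass : IEquatable<MyClass>
{
    public string BillId { get; set; }

    // Strongly typed equality used by HashSet<MyClass>, Distinct, and so on.
    public bool Equals(MyClass other)
    {
        return !ReferenceEquals(other, null) && BillId == other.BillId;
    }

    // Keep the object-based overload consistent with the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as MyClass);
    }

    // Equal objects must produce equal hash codes.
    public override int GetHashCode()
    {
        return BillId == null ? 0 : BillId.GetHashCode();
    }
}

public static class EquatableDemo
{
    public static List<MyClass> Sample()
    {
        return new List<MyClass>
        {
            new MyClass { BillId = "123" },
            new MyClass { BillId = "777" },
            new MyClass { BillId = "999" },
            new MyClass { BillId = "123" },
        };
    }

    public static void Main()
    {
        var list = Sample();

        // Both use the default equality comparer, which picks up IEquatable<MyClass>.
        Console.WriteLine(new HashSet<MyClass>(list).Count); // 3
        Console.WriteLine(list.Distinct().Count());          // 3
    }
}
```

One caveat with this design: because BillId is mutable and feeds GetHashCode, changing it while an instance sits in a HashSet leaves the set in an inconsistent state, so treat the key as read-only once the object has been added.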

O. Jones