You've asked for a straightforward solution to the problem, and the `GroupBy`+`Where`+`Select` solutions satisfy this requirement perfectly, but you might also be interested in a highly performant and memory-efficient solution. Below is an implementation that uses all the tools currently available (.NET 6+) for maximum efficiency:
```csharp
/// <summary>
/// Returns a sequence of elements that appear exactly once in the source sequence,
/// according to a specified key selector function.
/// </summary>
public static IEnumerable<TSource> UniqueBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> comparer = default)
{
    ArgumentNullException.ThrowIfNull(source);
    ArgumentNullException.ThrowIfNull(keySelector);
    Dictionary<TKey, (TSource Item, bool Unique)> dictionary = new(comparer);
    if (source.TryGetNonEnumeratedCount(out int count))
        dictionary.EnsureCapacity(count); // Assume that most items are unique
    foreach (TSource item in source)
        CollectionsMarshal.GetValueRefOrAddDefault(dictionary, keySelector(item),
            out bool exists) = exists ? default : (item, true);
    foreach ((TSource item, bool unique) in dictionary.Values)
        if (unique)
            yield return item;
}
```
The `TryGetNonEnumeratedCount`+`EnsureCapacity` combination can significantly reduce the amount of memory allocated during the enumeration of the source, in case the source is a type with a well-known size, like a `List<T>`.
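To see when the count is actually available, here is a small standalone sketch (the variable names are illustrative): a `List<T>` reports its count without enumerating, while a deferred `Where` query does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static void Main()
    {
        // A List<T> knows its count up front, so TryGetNonEnumeratedCount succeeds
        List<int> list = new() { 1, 2, 3 };
        Console.WriteLine(list.TryGetNonEnumeratedCount(out int count)); // True
        Console.WriteLine(count); // 3

        // A deferred Where query would have to enumerate to count, so it reports false
        IEnumerable<int> filtered = list.Where(x => x > 1);
        Console.WriteLine(filtered.TryGetNonEnumeratedCount(out _)); // False
    }
}
```

In the `False` case the `UniqueBy` implementation above simply skips the `EnsureCapacity` call and lets the dictionary grow on demand.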
The `CollectionsMarshal.GetValueRefOrAddDefault` ensures that each key is hashed only once, which can be impactful in case the keys have expensive `GetHashCode` implementations.
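To illustrate the ref-returning pattern on its own (a standalone sketch, separate from the `UniqueBy` method above), here is the classic occurrence-counting use: the returned `ref` points directly at the dictionary's internal slot, so there is no separate `ContainsKey`/indexer pair and each key is hashed once per iteration.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

class Demo
{
    static void Main()
    {
        var counts = new Dictionary<string, int>();
        foreach (string word in new[] { "a", "b", "a" })
        {
            // When the key is new, the slot is added holding default(int) == 0;
            // either way, 'count' aliases the dictionary's internal storage.
            ref int count = ref CollectionsMarshal.GetValueRefOrAddDefault(counts, word, out _);
            count++;
        }
        Console.WriteLine($"{counts["a"]} {counts["b"]}"); // 2 1
    }
}
```

The `UniqueBy` above uses the same mechanism, but assigns a tuple through the `ref` instead of incrementing a counter.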
Usage example:
```csharp
List<MyClass> unique = myClassObject.UniqueBy(x => x.BillId).ToList();
```
The difference of the above `UniqueBy` from the built-in `DistinctBy` LINQ operator is that the former eliminates the duplicates completely, while the latter preserves the first occurrence of each duplicate element.