Remove objects with a duplicate property from List

Question

I have a List of objects in C#. All of the objects contain a property ID. There are several objects that have the same ID property.

How can I trim the List (or make a new List) where there is only one object per ID property?

[Any additional duplicates are dropped out of the List]

score 222 · Accepted Answer · edited Sep 29 '20 at 09:39

If you want to avoid using a third-party library, you could do something like:

var bar = fooArray.GroupBy(x => x.Id).Select(x => x.First()).ToList();

That will group the array by the Id property, then select the first entry in the grouping.
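As a minimal, self-contained sketch of the same idea: the `Foo` record, its `Name` property and the sample data below are assumptions made up for illustration, since the question only says the objects have an ID. In LINQ to Objects, `GroupBy` yields groups in the order their keys first appear and keeps elements in source order within each group, so `First()` picks the earliest object for each Id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type, for illustration only.
public record Foo(int Id, string Name);

public static class GroupByDemo
{
    // Keeps the first object seen for each Id; later duplicates are dropped.
    public static List<Foo> FirstPerId(IEnumerable<Foo> items) =>
        items.GroupBy(x => x.Id)
             .Select(g => g.First())
             .ToList();

    public static void Main()
    {
        var fooArray = new[]
        {
            new Foo(1, "a"), new Foo(2, "b"), new Foo(1, "c"),
            new Foo(3, "d"), new Foo(2, "e"),
        };

        // Three objects remain: the ones named "a", "b" and "d".
        foreach (Foo foo in FirstPerId(fooArray))
            Console.WriteLine(foo);
    }
}
```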

edited Sep 29 '20 at 09:39

Daniel Lord

answered Apr 03 '12 at 12:25

Daniel Mann

This worked perfectly. Here is my implementation: List<T> uniqueRows = inputRows.GroupBy(x => x.Id).Select(x => x.First()).ToList<T>(); – Baxter Apr 03 '12 at 13:24

Glad to help! One note: the explicit type argument on your `ToList<T>()` is redundant. You should be able to just do `.ToList()` – Daniel Mann Apr 03 '12 at 13:43

You are right, it works with just ToList() instead of ToList<T>() – Baxter Apr 03 '12 at 16:20

A good alternative to trying to figure out why Distinct and IEquatable are not working. – Ariwibawa Nov 19 '20 at 04:55

score 35 · Answer 2 · edited Nov 30 '18 at 14:26

MoreLINQ's DistinctBy() will do the job; it lets you use an object property to determine distinctness. Unfortunately, the built-in LINQ Distinct() is not flexible enough.

var uniqueItems = allItems.DistinctBy(i => i.Id);

DistinctBy()

Returns all distinct elements of the given source, where "distinctness" is determined via a projection and the default equality comparer for the projected type.

PS: Credits to Jon Skeet for sharing this library with the community.

edited Nov 30 '18 at 14:26

Kols

answered Apr 03 '12 at 12:26

sll

I think this is a great solution but am trying to avoid using a 3rd party library for this. Thank You. – Baxter Apr 03 '12 at 13:26
Fortunately you can see how it is implemented – sll Apr 03 '12 at 13:40

Theodor Zoulias · Answer 3 · 2023-01-18T14:18:03.157

Starting from .NET 6, a new DistinctBy LINQ operator is available:

public static IEnumerable<TSource> DistinctBy<TSource,TKey> (
    this IEnumerable<TSource> source,
    Func<TSource,TKey> keySelector);

Returns distinct elements from a sequence according to a specified key selector function.

Usage example:

List<Item> distinctList = listWithDuplicates
    .DistinctBy(i => i.Id)
    .ToList();

There is also an overload that has an IEqualityComparer<TKey> parameter.
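A short sketch of that comparer overload, assuming a hypothetical `Item` record whose `Id` is a string. Here `StringComparer.OrdinalIgnoreCase` makes keys that differ only in case count as duplicates. This requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type with a string key, for illustration only.
public record Item(string Id, string Name);

public static class DistinctByComparerDemo
{
    // Treats "abc" and "ABC" as the same key.
    public static List<Item> DistinctIgnoringCase(IEnumerable<Item> items) =>
        items.DistinctBy(i => i.Id, StringComparer.OrdinalIgnoreCase).ToList();

    public static void Main()
    {
        var listWithDuplicates = new List<Item>
        {
            new Item("abc", "first"),
            new Item("ABC", "second"), // same key when case is ignored
            new Item("xyz", "third"),
        };

        List<Item> distinctList = DistinctIgnoringCase(listWithDuplicates);
        Console.WriteLine(distinctList.Count); // 2
    }
}
```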

Update in-place: In case creating a new List<T> is not desirable, here is a RemoveDuplicates extension method for the List<T> class:

/// <summary>
/// Removes all the elements that are duplicates of previous elements,
/// according to a specified key selector function.
/// </summary>
/// <returns>
/// The number of elements removed.
/// </returns>
public static int RemoveDuplicates<TSource, TKey>(
    this List<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> keyComparer = null)
{
    ArgumentNullException.ThrowIfNull(source);
    ArgumentNullException.ThrowIfNull(keySelector);
    HashSet<TKey> hashSet = new(keyComparer);
    return source.RemoveAll(item => !hashSet.Add(keySelector(item)));
}

This method is efficient (O(n)) but also a bit dangerous, because it is based on the potentially corruptive List<T>.RemoveAll method¹. In case the keySelector lambda succeeds for some elements and then fails for another element, the partially modified List<T> will neither be restored to its initial state, nor will it be in a state recognizable as the result of successful individual Removes. Instead it will transition to a corrupted state that includes duplicate occurrences of existing elements. So in case the keySelector lambda is not fail-proof, the RemoveDuplicates method should be invoked in a try block that has a catch block where the potentially corrupted list is discarded.

Alternatively you could substitute the dangerous built-in RemoveAll with a safe custom implementation, that offers predictable behavior.

¹ For all .NET versions and platforms, including the latest .NET 7. I have submitted a proposal on GitHub to document the corruptive behavior of the List<T>.RemoveAll method, and the feedback that I received was that neither the behavior should be documented, nor the implementation should be fixed.
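One possible shape for such a safe replacement is sketched below. `SafeRemoveAll` is a hypothetical name, not a built-in method and not necessarily the implementation the answer has in mind. It runs the predicate over every element first, collecting the survivors in a temporary list, and only touches the original list after the predicate has succeeded for all elements. The cost is O(n) extra memory.

```csharp
using System;
using System.Collections.Generic;

public static class ListExtensions
{
    // Hypothetical helper: removes all elements matching the predicate.
    // If the predicate throws, the source list is left unmodified.
    public static int SafeRemoveAll<T>(this List<T> source, Predicate<T> match)
    {
        ArgumentNullException.ThrowIfNull(source);
        ArgumentNullException.ThrowIfNull(match);

        // Phase 1: evaluate the predicate for every element.
        // An exception here leaves 'source' untouched.
        var kept = new List<T>(source.Count);
        foreach (T item in source)
        {
            if (!match(item)) kept.Add(item);
        }

        // Phase 2: no user code runs from here on.
        int removed = source.Count - kept.Count;
        if (removed > 0)
        {
            source.Clear();
            source.AddRange(kept);
        }
        return removed;
    }
}

public static class SafeRemoveAllDemo
{
    public static void Main()
    {
        var list = new List<int> { 1, 2, 1, 3, 2 };
        var seen = new HashSet<int>();

        // Same pattern as RemoveDuplicates, with the key being the element itself.
        int removed = list.SafeRemoveAll(x => !seen.Add(x));
        Console.WriteLine($"{removed} removed, {list.Count} left"); // 2 removed, 3 left
    }
}
```

Inside RemoveDuplicates, the call to `source.RemoveAll(...)` could then be swapped for `source.SafeRemoveAll(...)` with the same lambda.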

score 7 · Answer 4 · edited Nov 30 '18 at 10:22

var list = GetListFromSomeWhere();
var list2 = GetListFromSomeWhere();
list.AddRange(list2);

....
...
var distinctedList = list.DistinctBy(x => x.ID).ToList();

MoreLINQ at GitHub

Or, if you don't want to use external DLLs for some reason, you can use this Distinct overload:

public static IEnumerable<TSource> Distinct<TSource>(
    this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)

Usage:

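public class FooComparer : IEqualityComparer<Foo>
{
    // Two Foo objects are considered equal if their IDs are equal.
    public bool Equals(Foo x, Foo y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.ID == y.ID;
    }

    // Required by IEqualityComparer<Foo>: equal objects must return equal hash codes.
    // Assumes ID itself is never null.
    public int GetHashCode(Foo obj)
    {
        return Object.ReferenceEquals(obj, null) ? 0 : obj.ID.GetHashCode();
    }
}

var distinctList = list.Distinct(new FooComparer()).ToList();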

score 3 · Answer 5 · edited Jul 28 '20 at 15:29

Not sure if anyone is still looking for additional ways to do this, but I've used this code to remove duplicates from a list of User objects based on matching ID numbers.

private ArrayList RemoveSearchDuplicates(ArrayList SearchResults)
{
    ArrayList TempList = new ArrayList();

    foreach (User u1 in SearchResults)
    {
        bool duplicatefound = false;
        foreach (User u2 in TempList)
            if (u1.ID == u2.ID)
                duplicatefound = true;

        if (!duplicatefound)
            TempList.Add(u1);
    }
    return TempList;
}

Call: SearchResults = RemoveSearchDuplicates(SearchResults);

This is pointlessly O(n^2) when a regular GroupBy is just O(n)... – Alexei Levenkov Nov 17 '20 at 21:42
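For comparison, here is a sketch of the same loop made O(n) by tracking already-seen IDs in a HashSet instead of rescanning the result list. The `User` class shown is a stand-in, and the assumption that `ID` is an `int` is made only for illustration. A generic `List<User>` replaces the `ArrayList`.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the answer's User type; an int ID is assumed.
public class User
{
    public int ID { get; set; }
    public string Name { get; set; } = "";
}

public static class UserDeduplication
{
    // Returns a new list holding the first User seen for each ID.
    public static List<User> RemoveSearchDuplicates(IEnumerable<User> searchResults)
    {
        var seenIds = new HashSet<int>();
        var result = new List<User>();

        foreach (User u in searchResults)
        {
            // HashSet<T>.Add returns false when the ID is already present.
            if (seenIds.Add(u.ID))
                result.Add(u);
        }
        return result;
    }

    public static void Main()
    {
        var searchResults = new List<User>
        {
            new User { ID = 1, Name = "Ann" },
            new User { ID = 2, Name = "Bob" },
            new User { ID = 1, Name = "Ann (duplicate)" },
        };

        searchResults = RemoveSearchDuplicates(searchResults);
        Console.WriteLine(searchResults.Count); // 2
    }
}
```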
