105

I have a List of objects in C#. All of the objects contain a property ID. There are several objects that have the same ID property.

How can I trim the List (or make a new List) where there is only one object per ID property?

[Any additional duplicates are dropped out of the List]

Kara
Baxter

5 Answers

222

If you want to avoid using a third-party library, you could do something like:

var bar = fooArray.GroupBy(x => x.Id).Select(x => x.First()).ToList();

That will group the array by the Id property, then select the first entry of each group.
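As a quick illustration of the grouping approach, here is a minimal, self-contained sketch. The Foo record and its sample values are hypothetical and only serve to make the example runnable; it assumes C# 9 or later for top-level statements and records.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: two entries share Id 1.
var fooArray = new[]
{
    new Foo(1, "first"),
    new Foo(2, "second"),
    new Foo(1, "duplicate of 1"),
};

// Group by Id, then keep the first element of each group.
List<Foo> bar = fooArray
    .GroupBy(x => x.Id)
    .Select(g => g.First())
    .ToList();

foreach (Foo foo in bar)
    Console.WriteLine($"{foo.Id}: {foo.Name}");
// prints:
// 1: first
// 2: second

// Illustrative type, not part of the original question.
record Foo(int Id, string Name);
```

For in-memory sequences, GroupBy yields groups in the order their keys first appear, so the surviving element for each Id is the earliest one in the source.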

Daniel Lord
Daniel Mann
35

MoreLINQ's DistinctBy() will do the job: it lets you use an object property to determine distinctness. Unfortunately, the built-in LINQ Distinct() is not flexible enough.

var uniqueItems = allItems.DistinctBy(i => i.Id);

DistinctBy()

Returns all distinct elements of the given source, where "distinctness" is determined via a projection and the default equality comparer for the projected type.

PS: Credits to Jon Skeet for sharing this library with the community.

Kols
sll
11

Starting from .NET 6, a new DistinctBy LINQ operator is available:

public static IEnumerable<TSource> DistinctBy<TSource,TKey> (
    this IEnumerable<TSource> source,
    Func<TSource,TKey> keySelector);

Returns distinct elements from a sequence according to a specified key selector function.

Usage example:

List<Item> distinctList = listWithDuplicates
    .DistinctBy(i => i.Id)
    .ToList();

There is also an overload that has an IEqualityComparer<TKey> parameter.
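To show what the comparer overload changes, here is a small sketch that treats string IDs as equal regardless of case. The Item record and its values are hypothetical; the example assumes .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: "abc" and "ABC" differ only by case.
var listWithDuplicates = new List<Item>
{
    new Item("abc", "first"),
    new Item("ABC", "same ID, different case"),
    new Item("xyz", "third"),
};

// The second argument decides when two keys count as equal.
List<Item> distinctList = listWithDuplicates
    .DistinctBy(i => i.Id, StringComparer.OrdinalIgnoreCase)
    .ToList();

Console.WriteLine(distinctList.Count); // prints 2

// Illustrative type, not part of the original answer.
record Item(string Id, string Label);
```

Without the comparer argument, the default string comparison is case-sensitive and all three items would be kept.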


Update in-place: In case creating a new List<T> is not desirable, here is a RemoveDuplicates extension method for the List<T> class:

/// <summary>
/// Removes all the elements that are duplicates of previous elements,
/// according to a specified key selector function.
/// </summary>
/// <returns>
/// The number of elements removed.
/// </returns>
public static int RemoveDuplicates<TSource, TKey>(
    this List<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> keyComparer = null)
{
    ArgumentNullException.ThrowIfNull(source);
    ArgumentNullException.ThrowIfNull(keySelector);
    HashSet<TKey> hashSet = new(keyComparer);
    return source.RemoveAll(item => !hashSet.Add(keySelector(item)));
}

This method is efficient (O(n)) but also a bit dangerous, because it is based on the potentially corruptive List<T>.RemoveAll method¹. If the keySelector lambda succeeds for some elements and then fails for another, the partially modified List<T> will neither be restored to its initial state, nor will it be in a state recognizable as the result of successful individual Remove calls. Instead, it will transition to a corrupted state that includes duplicate occurrences of existing elements. So if the keySelector lambda is not fail-proof, the RemoveDuplicates method should be invoked in a try block with a catch block in which the potentially corrupted list is discarded.

Alternatively, you could replace the dangerous built-in RemoveAll with a safe custom implementation that offers predictable behavior.
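One possible shape for such a safe variant is sketched below. It is not the implementation the answer links to; the name RemoveDuplicatesSafe is hypothetical. The idea is to compute every key in a read-only pass first and touch the list only afterwards, at the cost of O(n) extra memory. It assumes .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;

// Usage: the list is modified only if every key was computed successfully.
var list = new List<int> { 1, 2, 1, 3, 2 };
int removed = list.RemoveDuplicatesSafe(x => x);
Console.WriteLine($"{removed} removed, left: {string.Join(", ", list)}");
// prints: 2 removed, left: 1, 2, 3

public static class ListExtensions
{
    // Hypothetical helper: removes later duplicates by key, leaving the
    // list untouched if keySelector (or the comparer) throws.
    public static int RemoveDuplicatesSafe<TSource, TKey>(
        this List<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> keyComparer = null)
    {
        ArgumentNullException.ThrowIfNull(source);
        ArgumentNullException.ThrowIfNull(keySelector);

        // Phase 1: read-only pass. An exception here leaves source unchanged.
        HashSet<TKey> seen = new(keyComparer);
        List<TSource> kept = new(source.Count);
        foreach (TSource item in source)
        {
            if (seen.Add(keySelector(item)))
                kept.Add(item);
        }

        // Phase 2: mutate only after all keys were computed.
        int removedCount = source.Count - kept.Count;
        if (removedCount > 0)
        {
            source.Clear();
            source.AddRange(kept);
        }
        return removedCount;
    }
}
```

If the selector throws partway through, the exception propagates out of the first loop before Clear is ever called, so the caller still holds the original list contents.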

¹ For all .NET versions and platforms, including the latest .NET 7. I submitted a proposal on GitHub to document the corruptive behavior of the List<T>.RemoveAll method, and the feedback I received was that the behavior should not be documented and the implementation should not be fixed.

Theodor Zoulias
7
var list = GetListFromSomeWhere();
var list2 = GetListFromSomeWhere();
list.AddRange(list2);

....
...
var distinctedList = list.DistinctBy(x => x.ID).ToList();

More LINQ at GitHub

Or, if you don't want to use external DLLs for some reason, you can use this Distinct overload:

public static IEnumerable<TSource> Distinct<TSource>(
    this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)

Usage:

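public class FooComparer : IEqualityComparer<Foo>
{
    // Two Foo objects are equal if their IDs are equal.
    public bool Equals(Foo x, Foo y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether either of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.ID == y.ID;
    }

    // Distinct() buckets items by hash code before calling Equals, so equal
    // IDs must produce equal hash codes (assumes ID itself is never null).
    public int GetHashCode(Foo obj)
    {
        return Object.ReferenceEquals(obj, null) ? 0 : obj.ID.GetHashCode();
    }
}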



var distinctList = list.Distinct(new FooComparer()).ToList();
Kols
gdoron
3

Not sure if anyone is still looking for any additional ways to do this. But I've used this code to remove duplicates from a list of User objects based on matching ID numbers.

private ArrayList RemoveSearchDuplicates(ArrayList SearchResults)
{
    ArrayList TempList = new ArrayList();

    foreach (User u1 in SearchResults)
    {
        bool duplicatefound = false;
        foreach (User u2 in TempList)
        {
            if (u1.ID == u2.ID)
            {
                duplicatefound = true;
                break; // no need to scan the rest once a match is found
            }
        }

        if (!duplicatefound)
            TempList.Add(u1);
    }
    return TempList;
}

Call: SearchResults = RemoveSearchDuplicates(SearchResults);

Nikita Popov
JScott