44

How can I use C# and LINQ to get the result shown below from the following list?

 var pr = new List<Product>()
   {
       new Product() {Title="Boots",Color="Red",    Price=1},
       new Product() {Title="Boots",Color="Green",  Price=1},
       new Product() {Title="Boots",Color="Black",  Price=2},

       new Product() {Title="Sword",Color="Gray", Price=2},
       new Product() {Title="Sword",Color="Green",Price=2}
   };

Result:

        {Title="Boots",Color="Red",  Price=1},               
        {Title="Boots",Color="Black",  Price=2},             
        {Title="Sword",Color="Gray", Price=2}

I know that I should use GroupBy or Distinct, but I don't understand how to get what I need:

   List<Product> result = pr.GroupBy(g => g.Title, g.Price).ToList(); //not working
   List<Product> result  = pr.Distinct(...);

Please help

3 Answers

100

Group by the needed properties and select the first item of each group:

List<Product> result = pr.GroupBy(g => new { g.Title, g.Price })
                         .Select(g => g.First())
                         .ToList();
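
As a quick check against the data in the question, here is a minimal, self-contained sketch. It uses anonymous objects in place of the `Product` class (an assumption made only to keep the snippet short) and prints the first item of each `Title`/`Price` group:

```csharp
using System;
using System.Linq;

// Stand-in for the question's List<Product>; anonymous objects are an
// assumption made to keep the sketch self-contained.
var pr = new[]
{
    new { Title = "Boots", Color = "Red",   Price = 1 },
    new { Title = "Boots", Color = "Green", Price = 1 },
    new { Title = "Boots", Color = "Black", Price = 2 },
    new { Title = "Sword", Color = "Gray",  Price = 2 },
    new { Title = "Sword", Color = "Green", Price = 2 },
};

// Group on the composite key, then keep the first element of each group.
var result = pr.GroupBy(p => new { p.Title, p.Price })
               .Select(g => g.First())
               .ToList();

foreach (var p in result)
    Console.WriteLine($"{p.Title} {p.Color} {p.Price}");
// Boots Red 1
// Boots Black 2
// Sword Gray 2
```

Anonymous types compare by value, which is why `new { g.Title, g.Price }` works as a key. In LINQ to Objects, `GroupBy` yields groups in order of first appearance and keeps elements in source order, so `First()` returns the earliest item of each group.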
Sergey Berezovskiy
Ilya Sulimanov
5

While a new anonymous type will work, it might make more sense, be more readable, and be consumable outside of your method to either create your own type or use a Tuple. (Other times it may simply suffice to use a delimited string: string.Format("{0}.{1}", g.Title, g.Price).)

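var groups = pr.GroupBy(g => new Tuple<string, decimal>(g.Title, g.Price))
               .ToList(); // List<IGrouping<Tuple<string, decimal>, Product>>, not List<Product>

// A custom key type must override Equals and GetHashCode to group by value
var groups = pr.GroupBy(g => new ProductTitlePriceGroupKey(g.Title, g.Price))
               .ToList();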
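
On C# 7 and later, a value tuple is a lighter alternative to `Tuple<,>` for the key, since it also compares element by element and allows named elements instead of `Item1`/`Item2`. This is a swapped-in variation rather than the approach shown above; the sketch uses anonymous stand-ins for `Product` (an assumption for brevity):

```csharp
using System;
using System.Linq;

// Anonymous stand-ins for the question's Product list (assumption).
var pr = new[]
{
    new { Title = "Boots", Color = "Red",   Price = 1 },
    new { Title = "Boots", Color = "Green", Price = 1 },
    new { Title = "Boots", Color = "Black", Price = 2 },
    new { Title = "Sword", Color = "Gray",  Price = 2 },
    new { Title = "Sword", Color = "Green", Price = 2 },
};

// Value tuples have structural equality, so equal Title/Price pairs
// land in the same group. The element names are chosen for readability.
var groups = pr.GroupBy(p => (Title: p.Title, Price: p.Price)).ToList();

foreach (var g in groups)
    Console.WriteLine($"{g.Key.Title} @ {g.Key.Price}: {g.Count()} item(s)");
// Boots @ 1: 2 item(s)
// Boots @ 2: 1 item(s)
// Sword @ 2: 2 item(s)
```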

As for getting the result set you want, the provided answer suggests just returning the first, and perhaps that's OK for your purposes, but ideally you'd need to provide a means by which Color is aggregated or ignored.

For instance, perhaps you'd rather list the colors included, somehow:

List<Product> result = pr
                     .GroupBy(g => new Tuple<string, decimal>(g.Title, g.Price))
                     .Select(x => new Product()
                             { 
                                  Title = x.Key.Item1, 
                                  Price = x.Key.Item2,
                                  // an IGrouping is itself the sequence of its elements (there is no .Value)
                                  Color = string.Join(", ", x.Select(y => y.Color)) // "Red, Green"
                             })
                     .ToList();

In the case of a simple string property for color, it may make sense to simply concatenate them. If you had another entity there, or simply don't want to abstract away that information, perhaps it would be best to have another entity altogether that has a collection of that entity type. For instance, if you were grouping on title and color, you might want to show the average price, or a range of prices, where simply selecting the first of each group would prevent you from doing so.

List<ProductGroup> result = pr
                     .GroupBy(g => new Tuple<string, decimal>(g.Title, g.Price))
                     .Select(x => new ProductGroup()
                             { 
                                  Title = x.Key.Item1, 
                                  Price = x.Key.Item2,
                                  // enumerate the grouping directly (there is no .Value)
                                  Colors = x.Select(y => y.Color)
                             })
                     .ToList();
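
The same pattern extends to the other aggregates mentioned above. As a rough sketch, grouping by title alone lets each group report its colors together with a price range and average. The anonymous projection and the property names `MinPrice`, `MaxPrice`, and `AveragePrice` are illustrative assumptions, not part of the `ProductGroup` type used above:

```csharp
using System;
using System.Linq;

// Anonymous stand-ins for the question's Product list (assumption).
var pr = new[]
{
    new { Title = "Boots", Color = "Red",   Price = 1 },
    new { Title = "Boots", Color = "Green", Price = 1 },
    new { Title = "Boots", Color = "Black", Price = 2 },
    new { Title = "Sword", Color = "Gray",  Price = 2 },
    new { Title = "Sword", Color = "Green", Price = 2 },
};

// One summary row per title, keeping every color instead of only the first.
var summary = pr.GroupBy(p => p.Title)
                .Select(g => new
                {
                    Title = g.Key,
                    Colors = g.Select(p => p.Color).Distinct().ToList(),
                    MinPrice = g.Min(p => p.Price),
                    MaxPrice = g.Max(p => p.Price),
                    AveragePrice = g.Average(p => p.Price), // double
                })
                .ToList();

foreach (var s in summary)
    Console.WriteLine($"{s.Title}: {string.Join(", ", s.Colors)} ({s.MinPrice}-{s.MaxPrice})");
// Boots: Red, Green, Black (1-2)
// Sword: Gray, Green (2-2)
```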
JoeBrockhaus
  • Your use of Tuple for the group-by.....but then still creating a (first class).. "new ProductGroup()".. very nice trick. Thanks. – granadaCoder Aug 11 '21 at 20:03
0

If you want to abstract away some of the logic into a reusable extension method, you can add the following:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        // HashSet<T>.Add returns false if the key was already present,
        // so the key selector only needs to run once per element
        if (seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

This works for both single properties and composite keys, and returns the first element found for each distinct key.

// distinct by single property
var productsByTitle = pr.DistinctBy(a => a.Title);

// distinct by multiple properties
var productsByTitleAndColor = pr.DistinctBy(a => new { a.Title, a.Color });

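One benefit of this approach over GroupBy + First is that it streams its results: elements are yielded as soon as a new key is seen, so a later operator that stops early (such as Take or First) doesn't force a loop through the entire collection. GroupBy, by contrast, has to read the whole source before it can return its first group.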
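
To illustrate the difference, the following sketch counts how many source elements each approach pulls before `First()` returns. `Numbers` and `DistinctByKey` are hypothetical local helpers written for this example; the latter mirrors the extension method above, written as a local function so the snippet stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int pulled = 0;

// Hypothetical source that records how many elements were requested.
IEnumerable<int> Numbers()
{
    for (int i = 0; i < 1000; i++)
    {
        pulled++;
        yield return i;
    }
}

// Local-function version of the DistinctBy helper shown above.
IEnumerable<T> DistinctByKey<T, TKey>(IEnumerable<T> source, Func<T, TKey> keySelector)
{
    var seen = new HashSet<TKey>();
    foreach (var item in source)
        if (seen.Add(keySelector(item)))
            yield return item;
}

// Streaming: stops as soon as the first distinct element is produced.
int viaDistinct = DistinctByKey(Numbers(), n => n % 10).First();
int pulledByDistinct = pulled;

// GroupBy: buffers the whole source before yielding the first group.
pulled = 0;
int viaGroupBy = Numbers().GroupBy(n => n % 10).Select(g => g.First()).First();
int pulledByGroupBy = pulled;

Console.WriteLine($"DistinctBy pulled {pulledByDistinct}, GroupBy pulled {pulledByGroupBy}");
// DistinctBy pulled 1, GroupBy pulled 1000
```

On .NET 6 and later, `Enumerable.DistinctBy` is available in `System.Linq`, so a hand-written helper is mainly useful on older frameworks.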

Further Reading: linq query to return distinct field values from a list of objects

KyleMit