97

I have the following EF class derived from a database (simplified)

class Product
{ 
     public string ProductId;
     public string ProductName;
     public string CategoryId;
     public string CategoryName;
}

ProductId is the Primary Key of the table.

For a bad design decision made by the DB designer (I cannot modify it), I have CategoryId and CategoryName in this table.

I need a DropDownList with (distinct) CategoryId as Value and CategoryName as Text. Therefore I applied the following code:

product.Select(m => new {m.CategoryId, m.CategoryName}).Distinct();

which logically it should create an anonymous object with CategoryId and CategoryName as properties. The Distinct() guarantees that there are no duplicates pair (CategoryId, CategoryName).

But actually it does not work. As far as I understood the Distinct() works just when there is just one field in the collection otherwise it just ignores them...is it correct? Is there any workaround? Thanks!

UPDATE

Sorry product is:

List<Product> product = new List<Product>();

I found an alternative way to get the same result as Distinct():

product.GroupBy(d => new {d.CategoryId, d.CategoryName}) 
       .Select(m => new {m.Key.CategoryId, m.Key.CategoryName})
CiccioMiami
  • 8,028
  • 32
  • 90
  • 151
  • 'there is just one field in the collection' is nonsensical. What do you mean? – leppie May 23 '12 at 12:30
  • 1
    @leppie my guess is, he means when projecting to a single value, not an anonymous type (containing more than one field). – sehe May 23 '12 at 12:34
  • "For a bad design decision made by the DB designer (I cannot modify it)". You perhaps can't change the database, but this doesn't mean you can't fix this in your EF model. That's the beaty of EF. – Steven May 23 '12 at 12:36
  • Where doesn't it work ? Where are you in : classic asp.net, mvc ? What is "product" in product.Select ? – Raphaël Althaus May 23 '12 at 12:41
  • @leppie, sorry I meant "property in the collection" – CiccioMiami May 23 '12 at 12:43

10 Answers10

98

I assume that you use distinct like a method call on a list. You need to use the result of the query as datasource for your DropDownList, for example by materializing it via ToList.

var distinctCategories = product
                        .Select(m => new {m.CategoryId, m.CategoryName})
                        .Distinct()
                        .ToList();
DropDownList1.DataSource     = distinctCategories;
DropDownList1.DataTextField  = "CategoryName";
DropDownList1.DataValueField = "CategoryId";

Another way if you need the real objects instead of the anonymous type with only few properties is to use GroupBy with an anonymous type:

List<Product> distinctProductList = product
    .GroupBy(m => new {m.CategoryId, m.CategoryName})
    .Select(group => group.First())  // instead of First you can also apply your logic here what you want to take, for example an OrderBy
    .ToList();

A third option is to use MoreLinq's DistinctBy.

Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • 2
    @CiccioMiami upvoters, ensure you applying this answer only to anonymous types. Typed classes may require a delegate like [this one](http://stackoverflow.com/a/43010363/538763). – crokusek Mar 24 '17 at 22:51
  • Correct, the distinct will work only with Anonymous type – imdadhusen Jun 01 '17 at 12:16
  • 1
    @imdadhusen: not only with anonymous types, the type must override `Equals`+`GetHashCode` or you have to provide a custom `IEQualityCpomparer` in the overload of [`Distinct`](https://msdn.microsoft.com/en-us/library/bb338049(v=vs.110).aspx). – Tim Schmelter Jun 01 '17 at 12:22
  • @Tim, Can I use Where & Group By conditions together? In your second query or code, I just want to apply where condition before the group by, so Is it possible? – Md Aslam Jul 15 '20 at 13:47
17

Update for .NET 6 and above, DistinctBy was added:

myQueryable.DistinctBy(c => new { c.KeyA, c.KeyB});

https://learn.microsoft.com/en-us/dotnet/api/system.linq.queryable.distinctby?view=net-6.0

(for both IQueryable and IEnumerable)

sommmen
  • 6,570
  • 2
  • 30
  • 51
12

The Distinct() guarantees that there are no duplicates pair (CategoryId, CategoryName).

- exactly that

Anonymous types 'magically' implement Equals and GetHashcode

I assume another error somewhere. Case sensitivity? Mutable classes? Non-comparable fields?

sehe
  • 374,641
  • 47
  • 450
  • 633
6

This is my solution, it supports keySelectors of different types:

public static IEnumerable<TSource> DistinctBy<TSource>(this IEnumerable<TSource> source, params Func<TSource, object>[] keySelectors)
{
    // initialize the table
    var seenKeysTable = keySelectors.ToDictionary(x => x, x => new HashSet<object>());

    // loop through each element in source
    foreach (var element in source)
    {
        // initialize the flag to true
        var flag = true;

        // loop through each keySelector a
        foreach (var (keySelector, hashSet) in seenKeysTable)
        {                    
            // if all conditions are true
            flag = flag && hashSet.Add(keySelector(element));
        }

        // if no duplicate key was added to table, then yield the list element
        if (flag)
        {
            yield return element;
        }
    }
}

To use it:

list.DistinctBy(d => d.CategoryId, d => d.CategoryName)
Node.JS
  • 1,042
  • 6
  • 44
  • 114
5

Use the Key keyword in your select will work, like below.

product.Select(m => new {Key m.CategoryId, Key m.CategoryName}).Distinct();

I realize this is bringing up an old thread but figured it might help some people. I generally code in VB.NET when working with .NET so Key may translate differently into C#.

Michael Stramel
  • 1,337
  • 1
  • 16
  • 18
  • 5
    For doing a distinct on multiple fields in VB.Net, one has to use the 'Key' keywords for every property in the anonymous type. (Otherwise, the hashcode used for comparison will not get properly calculated) In C#, there is no 'Key' keyword - instead all properties in anonymous types are automatically Key fields. – voon Jun 26 '17 at 22:48
4

Distinct method returns distinct elements from a sequence.

If you take a look on its implementation with Reflector, you'll see that it creates DistinctIterator for your anonymous type. Distinct iterator adds elements to Set when enumerating over collection. This enumerator skips all elements which are already in Set. Set uses GetHashCode and Equals methods for defining if element already exists in Set.

How GetHashCode and Equals implemented for anonymous type? As it stated on msdn:

Equals and GetHashCode methods on anonymous types are defined in terms of the Equals and GetHashcode methods of the properties, two instances of the same anonymous type are equal only if all their properties are equal.

So, you definitely should have distinct anonymous objects, when iterating on distinct collection. And result does not depend on how many fields you use for your anonymous type.

Sergey Berezovskiy
  • 232,247
  • 41
  • 429
  • 459
4

Answering the headline of the question (what attracted people here) and ignoring that the example used anonymous types....

This solution will also work for non-anonymous types. It should not be needed for anonymous types.

Helper class:

/// <summary>
/// Allow IEqualityComparer to be configured within a lambda expression.
/// From https://stackoverflow.com/questions/98033/wrap-a-delegate-in-an-iequalitycomparer
/// </summary>
/// <typeparam name="T"></typeparam>
public class LambdaEqualityComparer<T> : IEqualityComparer<T>
{
    readonly Func<T, T, bool> _comparer;
    readonly Func<T, int> _hash;

    /// <summary>
    /// Simplest constructor, provide a conversion to string for type T to use as a comparison key (GetHashCode() and Equals().
    /// https://stackoverflow.com/questions/98033/wrap-a-delegate-in-an-iequalitycomparer, user "orip"
    /// </summary>
    /// <param name="toString"></param>
    public LambdaEqualityComparer(Func<T, string> toString)
        : this((t1, t2) => toString(t1) == toString(t2), t => toString(t).GetHashCode())
    {
    }

    /// <summary>
    /// Constructor.  Assumes T.GetHashCode() is accurate.
    /// </summary>
    /// <param name="comparer"></param>
    public LambdaEqualityComparer(Func<T, T, bool> comparer)
        : this(comparer, t => t.GetHashCode())
    {
    }

    /// <summary>
    /// Constructor, provide a equality comparer and a hash.
    /// </summary>
    /// <param name="comparer"></param>
    /// <param name="hash"></param>
    public LambdaEqualityComparer(Func<T, T, bool> comparer, Func<T, int> hash)
    {
        _comparer = comparer;
        _hash = hash;
    }

    public bool Equals(T x, T y)
    {
        return _comparer(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _hash(obj);
    }    
}

Simplest usage:

List<Product> products = duplicatedProducts.Distinct(
    new LambdaEqualityComparer<Product>(p =>
        String.Format("{0}{1}{2}{3}",
            p.ProductId,
            p.ProductName,
            p.CategoryId,
            p.CategoryName))
        ).ToList();

The simplest (but not that efficient) usage is to map to a string representation so that custom hashing is avoided. Equal strings already have equal hash codes.

Reference:
Wrap a delegate in an IEqualityComparer

Community
  • 1
  • 1
crokusek
  • 5,345
  • 3
  • 43
  • 61
1
public List<ItemCustom2> GetBrandListByCat(int id)
    {

        var OBJ = (from a in db.Items
                   join b in db.Brands on a.BrandId equals b.Id into abc1
                   where (a.ItemCategoryId == id)
                   from b in abc1.DefaultIfEmpty()
                   select new
                   {
                       ItemCategoryId = a.ItemCategoryId,
                       Brand_Name = b.Name,
                       Brand_Id = b.Id,
                       Brand_Pic = b.Pic,

                   }).Distinct();


        List<ItemCustom2> ob = new List<ItemCustom2>();
        foreach (var item in OBJ)
        {
            ItemCustom2 abc = new ItemCustom2();
            abc.CategoryId = item.ItemCategoryId;
            abc.BrandId = item.Brand_Id;
            abc.BrandName = item.Brand_Name;
            abc.BrandPic = item.Brand_Pic;
            ob.Add(abc);
        }
        return ob;

    }
0

the solution to your problem looks like this:

public class Category {
  public long CategoryId { get; set; }
  public string CategoryName { get; set; }
} 

...

public class CategoryEqualityComparer : IEqualityComparer<Category>
{
   public bool Equals(Category x, Category y)
     => x.CategoryId.Equals(y.CategoryId)
          && x.CategoryName .Equals(y.CategoryName, 
 StringComparison.OrdinalIgnoreCase);

   public int GetHashCode(Mapping obj)
     => obj == null 
         ? 0
         : obj.CategoryId.GetHashCode()
           ^ obj.CategoryName.GetHashCode();
}

...

 var distinctCategories = product
     .Select(_ => 
        new Category {
           CategoryId = _.CategoryId, 
           CategoryName = _.CategoryName
        })
     .Distinct(new CategoryEqualityComparer())
     .ToList();
Paul Alexeev
  • 172
  • 1
  • 11
-3
Employee emp1 = new Employee() { ID = 1, Name = "Narendra1", Salary = 11111, Experience = 3, Age = 30 };Employee emp2 = new Employee() { ID = 2, Name = "Narendra2", Salary = 21111, Experience = 10, Age = 38 };
Employee emp3 = new Employee() { ID = 3, Name = "Narendra3", Salary = 31111, Experience = 4, Age = 33 };
Employee emp4 = new Employee() { ID = 3, Name = "Narendra4", Salary = 41111, Experience = 7, Age = 33 };

List<Employee> lstEmployee = new List<Employee>();

lstEmployee.Add(emp1);
lstEmployee.Add(emp2);
lstEmployee.Add(emp3);
lstEmployee.Add(emp4);

var eemmppss=lstEmployee.Select(cc=>new {cc.ID,cc.Age}).Distinct();
Shell
  • 6,818
  • 11
  • 39
  • 70