384

I have a class Items with properties (Id, Name, Code, Price).

The list of Items is populated with duplicate items.

For example:

1         Item1       IT00001        $100
2         Item2       IT00002        $200
3         Item3       IT00003        $150
1         Item1       IT00001        $100
3         Item3       IT00003        $150

How can I remove the duplicates from the list using LINQ?

Çağdaş Tekin
Prasad

11 Answers

741
var distinctItems = items.GroupBy(x => x.Id).Select(y => y.First());
Chuck Norris
Freddy
455
var distinctItems = items.Distinct();

To match on only some of the properties, create a custom equality comparer, e.g.:

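class DistinctItemComparer : IEqualityComparer<Item> {

    public bool Equals(Item x, Item y) {
        // Same instance (or both null) is equal; exactly one null is not.
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id &&
            x.Name == y.Name &&
            x.Code == y.Code &&
            x.Price == y.Price;
    }

    public int GetHashCode(Item obj) {
        // Null-safe: a null Name or Code contributes 0 instead of throwing.
        return obj.Id.GetHashCode() ^
            (obj.Name?.GetHashCode() ?? 0) ^
            (obj.Code?.GetHashCode() ?? 0) ^
            obj.Price.GetHashCode();
    }
}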

Then use it like this:

var distinctItems = items.Distinct(new DistinctItemComparer());
Christian Hayter
  • Hi Christian, what would change in the code if I have a List and List? My custom class has various fields, one of which is a DCN number, and the other list holds only DCN numbers. I need to check whether the first list contains any DCN from the second. For example, suppose List1 = List and List2 = List. If List1 has 2,000 items and List2 has 40,000 items, and 600 items from List1 exist in List2, then I need the remaining 1,400 items of List1 as my output. What would the expression be? Thanks in advance. –  Aug 11 '10 at 02:54
  • One more case: since List1 contains various fields, the other field values might differ while the DCN stays the same. So in my case Distinct failed to give the desired output. –  Aug 11 '10 at 02:57
  • I find comparer classes extremely useful. They can express logic other than simple property name comparisons. I wrote a new one last month, to do something that `GroupBy` could not. – Christian Hayter Aug 19 '13 at 07:22
  • Works well and got me to learn something new and investigate the `XoR` operator `^` in C#. Had used in VB.NET via `Xor` but had to do a double take to your code to see what it was at first. – atconway May 02 '14 at 15:51
  • This is the error I get when I try to use Distinct Comparer: "LINQ to Entities does not recognize the method 'System.Linq.IQueryable`1[DataAccess.HR.Dao.CCS_LOCATION_TBL] Distinct[CCS_LOCATION_TBL](System.Linq.IQueryable`1[DataAccess.HR.Dao.CCS_LOCATION_TBL], System.Collections.Generic.IEqualityComparer`1[DataAccess.HR.Dao.CCS_LOCATION_TBL])' method, and this method cannot be translated into a store expression. – user8128167 Nov 13 '15 at 16:38
  • OK had to add AsEnumerable, see http://stackoverflow.com/questions/19424227/error-on-iequalitycomparert – user8128167 Nov 13 '15 at 16:51
  • LINQ to Entities doesn't support methods that use IEqualityComparer. [Supported and Unsupported LINQ Methods (LINQ to Entities)](https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/ef/language-reference/supported-and-unsupported-linq-methods-linq-to-entities). – Suncat2000 Jun 24 '20 at 13:29
  • Stuff like this is what makes me love C# – Kellen Stuart Nov 23 '20 at 18:56
  • @KolobCanyon It is used in hash table lookups. There is a good write-up at https://stackoverflow.com/a/4096774/115413 – Christian Hayter Nov 23 '20 at 21:48
47

If something is throwing off your Distinct query, you might want to look at MoreLINQ and use its DistinctBy operator to select distinct objects by id.

var distinct = items.DistinctBy( i => i.Id );
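
As a side note, a minimal sketch assuming .NET 6 or later: `Enumerable.DistinctBy` is part of `System.Linq` from that version on, so the same call works without MoreLINQ. The `BuiltInDistinctByDemo` class, the `FirstPerId` method, and the tuple shape are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInDistinctByDemo
{
    // Hypothetical helper: keeps one element per distinct Id.
    public static List<(int Id, string Name)> FirstPerId(IEnumerable<(int Id, string Name)> items)
    {
        // Enumerable.DistinctBy ships with System.Linq starting with .NET 6.
        return items.DistinctBy(i => i.Id).ToList();
    }
}
```

Calling `BuiltInDistinctByDemo.FirstPerId(new[] { (1, "Item1"), (2, "Item2"), (1, "Item1") })` returns two elements, one for each Id.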
tvanfosson
34

This is how I was able to group by with Linq. Hope it helps.

var query = collection.GroupBy(x => x.title).Select(y => y.FirstOrDefault());
sobelito
Victor Juri
  • @nawfal, I was suggesting FirstOrDefault() in lieu of First() – sobelito Jun 30 '14 at 19:21
  • If I am correct, using `FirstOrDefault` here offers no benefit if the `Select` immediately follows `GroupBy`, since there's no possibility of there being an empty group (the groups were _just derived_ from the collection's contents) – Roy Tinker Nov 18 '15 at 23:28
28

A universal extension method:

public static class EnumerableExtensions
{
    public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> enumerable, Func<T, TKey> keySelector)
    {
        return enumerable.GroupBy(keySelector).Select(grp => grp.First());
    }
}

Example of usage:

var lstDst = lst.DistinctBy(item => item.Key);
TOL
20

You have three options for removing duplicate items from your list:

  1. Use a custom equality comparer and then call Distinct(new DistinctItemComparer()), as @Christian Hayter mentioned.
  2. Use GroupBy, but note that you should group by all of the columns, because grouping by Id alone can also discard items that share an Id but differ in other properties. Consider the following example:

    List<Item> a = new List<Item>
    {
        new Item {Id = 1, Name = "Item1", Code = "IT00001", Price = 100},
        new Item {Id = 2, Name = "Item2", Code = "IT00002", Price = 200},
        new Item {Id = 3, Name = "Item3", Code = "IT00003", Price = 150},
        new Item {Id = 1, Name = "Item1", Code = "IT00001", Price = 100},
        new Item {Id = 3, Name = "Item3", Code = "IT00003", Price = 150},
        new Item {Id = 3, Name = "Item3", Code = "IT00004", Price = 250}
    };
    var distinctItems = a.GroupBy(x => x.Id).Select(y => y.First());
    

    The result for this grouping will be:

    {Id = 1, Name = "Item1", Code = "IT00001", Price = 100}
    {Id = 2, Name = "Item2", Code = "IT00002", Price = 200}
    {Id = 3, Name = "Item3", Code = "IT00003", Price = 150}
    

    This is incorrect because it treats {Id = 3, Name = "Item3", Code = "IT00004", Price = 250} as a duplicate. The correct query would be:

    var distinctItems = a.GroupBy(c => new { c.Id , c.Name , c.Code , c.Price})
                         .Select(c => c.First()).ToList();
    

  3. Override Equals and GetHashCode in the Item class:

    public class Item
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Code { get; set; }
        public int Price { get; set; }
    
        public override bool Equals(object obj)
        {
            if (!(obj is Item))
                return false;
            Item p = (Item)obj;
            return (p.Id == Id && p.Name == Name && p.Code == Code && p.Price == Price);
        }
        public override int GetHashCode()
        {
            return String.Format("{0}|{1}|{2}|{3}", Id, Name, Code, Price).GetHashCode();
        }
    }
    

    Then you can use it like this:

    var distinctItems = a.Distinct();
    
Salah Akbari
17

Use Distinct() but keep in mind that it uses the default equality comparer to compare values, so if you want anything beyond that you need to implement your own comparer.

Please see http://msdn.microsoft.com/en-us/library/bb348436.aspx for an example.
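
To illustrate what "default equality comparer" means in practice, here is a minimal sketch, assuming C# 9 or later for the record type. `PlainItem`, `RecordItem`, and `DefaultComparerDemo` are hypothetical names introduced for illustration. A class that does not override `Equals` is compared by reference, so two separately constructed copies both survive `Distinct()`, while a record gets compiler-generated value equality and collapses to one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: no Equals/GetHashCode override, so reference equality applies.
public class PlainItem
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical type: records get value-based Equals/GetHashCode from the compiler.
public record RecordItem(int Id, string Name);

public static class DefaultComparerDemo
{
    // Counts what survives Distinct() for two identical-looking copies of each type.
    public static (int PlainCount, int RecordCount) CountDistinct()
    {
        var plain = new List<PlainItem>
        {
            new PlainItem { Id = 1, Name = "Item1" },
            new PlainItem { Id = 1, Name = "Item1" },
        };
        var records = new List<RecordItem>
        {
            new RecordItem(1, "Item1"),
            new RecordItem(1, "Item1"),
        };

        // The two PlainItem instances are different references: both are kept.
        // The two RecordItem instances are equal by value: one is kept.
        return (plain.Distinct().Count(), records.Distinct().Count());
    }
}
```

`DefaultComparerDemo.CountDistinct()` returns `(2, 1)`, which is why a custom comparer (or an `Equals` override) is needed for plain classes.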

Brian Rasmussen
  • I should note that the default comparer works if the collection's member type is a value type. But which default equality comparer is selected for reference types? Reference types must supply their own comparer(s). – Nuri YILMAZ Mar 03 '17 at 19:20
6

Try this extension method out. Hopefully this could help.

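public static class DistinctHelper
{
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        // An iterator block creates a fresh HashSet for each enumeration, so
        // enumerating the result more than once still yields every distinct element.
        var identifiedKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            // HashSet.Add returns false when the key has already been seen.
            if (identifiedKeys.Add(keySelector(element)))
                yield return element;
        }
    }
}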

Usage:

var outputList = sourceList.DistinctBy(x => x.TargetProperty);
Kent Aguilar
4
List<Employee> employees = new List<Employee>()
{
    new Employee{Id =1,Name="AAAAA"}
    , new Employee{Id =2,Name="BBBBB"}
    , new Employee{Id =3,Name="AAAAA"}
    , new Employee{Id =4,Name="CCCCC"}
    , new Employee{Id =5,Name="AAAAA"}
};

List<Employee> duplicateEmployees = employees.Except(employees.GroupBy(i => i.Name)
                                             .Select(ss => ss.FirstOrDefault()))
                                            .ToList();
Salah Akbari
Arun Kumar
0

Another workaround, not beautiful but workable.

I have an XML file with an element called "MEMDES" that has two attributes, "GRADE" and "SPD", to record the RAM module information. There are a lot of duplicate items in SPD.

So here is the code I use to remove the duplicated items:

        IEnumerable<XElement> MList =
            from RAMList in PREF.Descendants("MEMDES")
            where (string)RAMList.Attribute("GRADE") == "DDR4"
            select RAMList;

        List<string> sellist = new List<string>();

        foreach (var MEMList in MList)
        {
            sellist.Add((string)MEMList.Attribute("SPD").Value);
        }

        foreach (string slist in sellist.Distinct())
        {
            comboBox1.Items.Add(slist);
        }
Rex Hsu
-1

If you don't want to write an IEqualityComparer, you can project to an anonymous type (which compares by value), call Distinct, and project back, as in the following.

 class Program
{

    private static void Main(string[] args)
    {

        var items = new List<Item>();
        items.Add(new Item {Id = 1, Name = "Item1"});
        items.Add(new Item {Id = 2, Name = "Item2"});
        items.Add(new Item {Id = 3, Name = "Item3"});

        // Not a duplicate
        items.Add(new Item {Id = 4, Name = "Item4"});
        // Duplicate of Item2
        items.Add(new Item {Id = 2, Name = "Item2"});
        // Duplicate of Item3
        items.Add(new Item {Id = 3, Name = "Item3"});

        var res = items.Select(i => new {i.Id, i.Name})
            .Distinct().Select(x => new Item {Id = x.Id, Name = x.Name}).ToList();

        // now res contains distinct records
    }



}


public class Item
{
    public int Id { get; set; }

    public string Name { get; set; }
}
Kundan Bhati