22

Suppose you have a list of MyObject like this:

public class MyObject
{
  public int ObjectID {get;set;}
  public string Prop1 {get;set;}
}

How do you remove duplicates from a list that may contain multiple instances of objects with the same ObjectID?

Thanks.

frenchie
  • Why should there ever be objects with the same ID (usually that is kept distinct for exactly this reason...)? I suppose you could create a list of the IDs and strings (as a composite string) and then do a .Distinct(), but I think there is something more fundamentally wrong if IDs are non-unique. – soandos May 11 '11 at 19:45
  • @soandos: He might have a way to get the same object in the list twice on accident. – Jay Sullivan May 11 '11 at 19:50
  • @notfed: That is still indicative of something going wrong. – soandos May 11 '11 at 19:51
  • But if that something going wrong is in an external source then removing duplicates is necessary. – dragoncmd May 12 '16 at 18:53

3 Answers

50

You can use GroupBy() and select the first item of each group to achieve what you want - assuming you want to pick one item for each distinct ObjectId property:

var distinctList = myList.GroupBy(x => x.ObjectID)
                         .Select(g => g.First())
                         .ToList();
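
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample data and the `GroupByDemo` / `DistinctById` names are made up for illustration and are not part of the original answer. With LINQ to Objects, groups come out in the order their keys first appear, so the item kept for each ID is the first one encountered in the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public int ObjectID { get; set; }
    public string Prop1 { get; set; }
}

public static class GroupByDemo
{
    // Hypothetical helper: keeps the first element seen for each ObjectID.
    public static List<MyObject> DistinctById(IEnumerable<MyObject> source)
    {
        return source.GroupBy(x => x.ObjectID)
                     .Select(g => g.First())
                     .ToList();
    }

    public static void Main()
    {
        // Illustrative data: two entries share ObjectID 1.
        var myList = new List<MyObject>
        {
            new MyObject { ObjectID = 1, Prop1 = "first" },
            new MyObject { ObjectID = 2, Prop1 = "second" },
            new MyObject { ObjectID = 1, Prop1 = "duplicate id" },
        };

        foreach (var item in DistinctById(myList))
            Console.WriteLine($"{item.ObjectID}: {item.Prop1}");
        // prints "1: first" then "2: second"
    }
}
```

If you would rather keep the last item per ID, swapping `g.First()` for `g.Last()` should do it.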

Alternatively there is also DistinctBy() in the MoreLinq project that would allow for a more concise syntax (but would add a dependency to your project):

var distinctList = myList.DistinctBy( x => x.ObjectID).ToList();
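
A side note that postdates this answer: from .NET 6 onward, `System.Linq` ships its own `Enumerable.DistinctBy`, so on those frameworks the same call works without MoreLinq. A rough sketch, assuming a .NET 6 or later target (the `DistinctByDemo` class and sample data are only for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public int ObjectID { get; set; }
    public string Prop1 { get; set; }
}

public static class DistinctByDemo
{
    // Hypothetical helper; relies on Enumerable.DistinctBy (.NET 6+).
    public static List<MyObject> DistinctById(IEnumerable<MyObject> source)
    {
        return source.DistinctBy(x => x.ObjectID).ToList();
    }

    public static void Main()
    {
        // Illustrative data: two entries share ObjectID 1.
        var myList = new List<MyObject>
        {
            new MyObject { ObjectID = 1, Prop1 = "Something" },
            new MyObject { ObjectID = 2, Prop1 = "Another thing" },
            new MyObject { ObjectID = 1, Prop1 = "Something" },
        };

        Console.WriteLine(DistinctById(myList).Count); // prints 2
    }
}
```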
BrokenGlass
  • Ok, cool, this is it. Would myList = myList.... also work? Just looking to avoid creating a new list. – frenchie May 11 '11 at 19:51
  • @frenchie: Yes of course, you can reassign to the same list variable – BrokenGlass May 11 '11 at 19:53
  • You cannot avoid creating a new list when using LINQ. If you want to keep the same list, you have to modify it yourself using `list.Remove(item)` or `list.RemoveAt(index)`. – Zebi May 11 '11 at 19:56
  • I tried this, but it is not working for my list of objects. I still get duplicates returned. Please see my question: http://stackoverflow.com/questions/42316343/groupby-to-remove-duplicates-from-ienumerable-list-of-objects – naz786 Feb 20 '17 at 13:41
  • Any downsides to using something like MoreLinq? – rollsch Mar 13 '17 at 07:04
  • nope - just the complexity cost of adding an external dependency – BrokenGlass Mar 13 '17 at 22:29
  • You can group with multiple properties with List MyUniqueList = MyList.GroupBy(x => new { x.Column1, x.Column2 }).Select(g=> g.First()).ToList(); – Sumit Joshi Oct 10 '17 at 07:25
  • Very clever, +1! – Rhurac Oct 17 '17 at 19:33
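
On the point raised in the comments about not creating a new list: one way to remove duplicates in place is `List<T>.RemoveAll` combined with a `HashSet<int>` of IDs already seen. This is only a sketch of that idea, not something from the answer above; the `InPlaceDemo` and `RemoveDuplicateIds` names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public int ObjectID { get; set; }
    public string Prop1 { get; set; }
}

public static class InPlaceDemo
{
    // Hypothetical helper: mutates the list, keeping the first item per ObjectID.
    public static void RemoveDuplicateIds(List<MyObject> list)
    {
        var seen = new HashSet<int>();
        // HashSet.Add returns false when the ID is already in the set,
        // so every later occurrence of an ID is removed.
        list.RemoveAll(x => !seen.Add(x.ObjectID));
    }

    public static void Main()
    {
        // Illustrative data: ObjectID 1 appears twice.
        var myList = new List<MyObject>
        {
            new MyObject { ObjectID = 1, Prop1 = "a" },
            new MyObject { ObjectID = 2, Prop1 = "b" },
            new MyObject { ObjectID = 1, Prop1 = "c" },
            new MyObject { ObjectID = 3, Prop1 = "d" },
        };

        RemoveDuplicateIds(myList);
        Console.WriteLine(string.Join(", ", myList.Select(x => x.ObjectID)));
        // prints "1, 2, 3"
    }
}
```

The list instance stays the same, so any other code holding a reference to it sees the deduplicated contents.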
12

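You can do this using the Distinct() method. That method uses the default equality comparer, which compares class instances by reference unless the class defines its own equality, so your class needs to do that, for example by implementing IEquatable<MyObject> like this: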

public class MyObject : IEquatable<MyObject>
{
    public int ObjectID {get;set;}
    public string Prop1 {get;set;}

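    // Two objects are considered equal when their IDs match.
    public bool Equals(MyObject other)
    {
        if (other == null) return false;
        return this.ObjectID.Equals(other.ObjectID);
    }

    // Keep Equals(object) consistent with the typed overload.
    public override bool Equals(object obj)
    {
        return Equals(obj as MyObject);
    }

    // Must agree with Equals: equal IDs give equal hash codes.
    public override int GetHashCode()
    {
        return this.ObjectID.GetHashCode();
    }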
}

Now you can use the Distinct() method:

List<MyObject> myList = new List<MyObject>();
myList.Add(new MyObject { ObjectID = 1, Prop1 = "Something" });
myList.Add(new MyObject { ObjectID = 2, Prop1 = "Another thing" });
myList.Add(new MyObject { ObjectID = 3, Prop1 = "Yet another thing" });
myList.Add(new MyObject { ObjectID = 1, Prop1 = "Something" });

var duplicatesRemoved = myList.Distinct().ToList();
Kristof Claes
  • That would make the code look so much more elegant, and easy to read! :-) – Juan Gomez May 11 '11 at 19:54
  • Why is this better than the 3-line answer provided above? What does it do more? – frenchie May 11 '11 at 20:25
  • It enables your objects to be compared. This will be needed for some other list features too, and objects which have an ID (entities) should be comparable by their ID. – Zebi May 11 '11 at 21:15
  • @frenchie: I believe it is more elegant and readable than the `GroupBy` solution and, unlike the `DistinctBy` solution, it works without adding extra libraries. – Kristof Claes May 12 '11 at 06:18
  • When I use this method and debug the list using a foreach loop, I still get duplicates. Please see my question: http://stackoverflow.com/questions/42316343/groupby-to-remove-duplicates-from-ienumerable-list-of-objects – naz786 Feb 20 '17 at 14:27
3

You could create a custom comparer by implementing the IEqualityComparer<MyObject> interface:

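public class MyObject
{
    public int ObjectID { get; set; }
    public string Prop1 { get; set; }
}

public class MyObjectComparer : IEqualityComparer<MyObject>
{
    // Treats two objects as equal when their IDs match.
    public bool Equals(MyObject x, MyObject y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ObjectID == y.ObjectID;
    }

    // Must agree with Equals: equal IDs give equal hash codes.
    public int GetHashCode(MyObject obj)
    {
        return obj.ObjectID.GetHashCode();
    }
}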

Then simply:

var distinctList = myList.Distinct(new MyObjectComparer()).ToList();
DigitalNomad