I am trying to find the cleanest, most performant way to remove items from a list of objects when they share a duplicate value for one of their properties. See the example below:

  public class MyModel
  {
    public string PropertyA { get; set; }

    public string PropertyB  { get; set; }

    public string PropertyC { get; set; }
  }

Now, let's say I have a List<MyModel> models that contains potentially thousands of entries. For each distinct value of PropertyB, I would like to keep only the first entry and remove all the others.

The only way I've thought of doing this seems pretty taxing on performance, and I'd like to find a different way. My current idea is below:

List<MyModel> models = LoadModels(); // LoadModels() is a placeholder: the list is initialized externally and contains thousands of records
List<MyModel> noDuplicatePropertyBs = new List<MyModel>();
List<string> propertyBs = new List<string>();

foreach(var model in models)
{
    if(!propertyBs.Contains(model.PropertyB))
    {
        noDuplicatePropertyBs.Add(model);
        propertyBs.Add(model.PropertyB);
    }
}

Edit: I think I could override Equals in MyModel to compare only PropertyB and then call .Distinct(). However, I already have an overridden Equals method that many other parts of the project rely on. Overriding Equals for this purpose also doesn't seem like a good idea, since in terms of business logic two objects are equal only when all 3 properties are equal.

GregH
  • You should use LINQ. Check out the documentation for more details. – h0r53 Jun 28 '17 at 18:24
  • Especially check out the accepted answer to the duplicate question. Just drop that method in your code (or add the MoreLinq NuGet package), and you can do: `var noDuplicatePropertyBs = models.DistinctBy(m => m.PropertyB);` – Joel Coehoorn Jun 28 '17 at 18:25
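
For readers who want to see roughly what such a drop-in method might look like, here is a minimal sketch. It is not the code from the linked answer or from MoreLinq; the name `DistinctByKey` and the `EnumerableExtensions` class are hypothetical, chosen so the method does not clash with the `Enumerable.DistinctBy` that ships with .NET 6 and later.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Hypothetical helper: yields the first element seen for each key, in source order.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
}
```

With this in scope, `models.DistinctByKey(m => m.PropertyB).ToList()` should keep the first model for each PropertyB value. On .NET 6 or later, the built-in `DistinctBy` can be used instead.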

2 Answers

Make propertyBs a HashSet<string> instead of a List<string>, and use the return value of Add to make a decision:

var propertyBs = new HashSet<string>();
var res = models.Where(m => propertyBs.Add(m.PropertyB)).ToList();

When you call Add on a hash set, it returns true only the first time a given value is added; every later attempt to add the same value returns false, so the corresponding models are filtered out.
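
As a self-contained illustration of the same idea, the sketch below wraps the two lines in a method and runs it on three sample models. The `FirstPerPropertyB` name and the sample data are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyModel
{
    public string PropertyA { get; set; }
    public string PropertyB { get; set; }
    public string PropertyC { get; set; }
}

public static class Program
{
    // Hypothetical helper: keeps the first model seen for each PropertyB value, in input order.
    public static List<MyModel> FirstPerPropertyB(IEnumerable<MyModel> models)
    {
        var seen = new HashSet<string>();
        // Add returns true only when the value was not already in the set.
        return models.Where(m => seen.Add(m.PropertyB)).ToList();
    }

    public static void Main()
    {
        // Sample data invented for the example.
        var models = new List<MyModel>
        {
            new MyModel { PropertyA = "a1", PropertyB = "x", PropertyC = "c1" },
            new MyModel { PropertyA = "a2", PropertyB = "y", PropertyC = "c2" },
            new MyModel { PropertyA = "a3", PropertyB = "x", PropertyC = "c3" },
        };

        foreach (var m in FirstPerPropertyB(models))
        {
            Console.WriteLine($"{m.PropertyA} {m.PropertyB}");
        }
        // prints "a1 x" then "a2 y"
    }
}
```

Because the lambda mutates the set, the query should be materialized immediately with ToList(), as above. Enumerating the deferred Where query a second time would find every key already in the set and return nothing.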

Sergey Kalinichenko

A possible solution is to use the LINQ GroupBy feature. Take a look at this: Group by in LINQ
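
The answer gives no code, so the following is only a sketch of how GroupBy could be applied to the question's model; the `FirstPerPropertyB` helper name is hypothetical. It groups by PropertyB and takes the first element of each group. In LINQ to Objects, groups are yielded in the order their keys first appear and elements keep their source order within a group, so the first occurrence is the one kept.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyModel
{
    public string PropertyA { get; set; }
    public string PropertyB { get; set; }
    public string PropertyC { get; set; }
}

public static class GroupByExample
{
    // Hypothetical helper: one model per distinct PropertyB, keeping the first occurrence.
    public static List<MyModel> FirstPerPropertyB(IEnumerable<MyModel> models)
    {
        return models
            .GroupBy(m => m.PropertyB)   // one group per distinct PropertyB value
            .Select(g => g.First())      // first model in each group, in source order
            .ToList();
    }
}
```

This builds a full group for every key before selecting from it, so it is likely to allocate more than the HashSet approach, although both make a single pass over the list.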

MNF