
I'm trying to refactor an old nested for-loop ("bubble"-style) routine that removes duplicates from a collection of Items: if properties X, Y and Z match those of a previously inserted Item, only the last item inserted should be kept in the collection:

 private void RemoveDuplicates()
 {       
   //Remove duplicated items.       
   int endloop = Items.Count;
   for (int i = 0; i < endloop - 1; i++)
   {
     var item = Items[i];
     for (int j = i + 1; j < endloop; j++)
     {
      if (!item.HasSamePropertiesThan(Items[j]))
      {
        continue;
      }

      AllItems.Remove(item);
      break;
     }
   }       
 }

where HasSamePropertiesThan() is an extension method for Item that does something similar to:

public static bool HasSamePropertiesThan(this Item i1, Item i2)
{
  return string.Equals(i1.X, i2.X, StringComparison.InvariantCulture)
      && string.Equals(i1.Y, i2.Y, StringComparison.InvariantCulture)
      && string.Equals(i1.Z, i2.Z, StringComparison.InvariantCulture);
}

so if I have a collection like:

[0]A
[1]A
[2]A
[3]B
[4]A
[5]A

I want to delete all duplicates, leaving only [3]B and [5]A in the collection.

So far, I've managed to craft these lambdas:

var query = items.GroupBy(i => new {i.X, i.Y, i.Z}).Select(i => i.Last());  // Retrieves the entities that should NOT be deleted
var dupes = Items.Except(query);
dupes.ToList().ForEach(d => Items.Remove(d));

based on these examples:

Remove duplicates in the list using linq

Delete duplicates using Lambda

Neither seems to work quite right (the wrong items get removed, and some items that should have been removed are left in the collection). What am I doing wrong?

Cœur
safejrz

2 Answers


A quick question: is "query" supposed to hold the result you are looking for? As I understand it, you take the list of items, run a query that selects the elements to keep, and finally remove everything else from the original list.

Correct me if I'm wrong, but isn't that the same as doing something like this:

items = items.GroupBy(i => new {i.X, i.Y, i.Z}).Select(i => i.Last()).ToList();

If "query" is not returning the right elements, then the problem is in how you are building the query, or you probably need to order the list before applying the query.
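
To make the GroupBy/Last idea concrete, here is a minimal sketch run against the A, A, A, B, A, A example from the question. The Item class below is a hypothetical stand-in (the real type isn't shown in the question), and Id exists only to show which instance survives. Note that the anonymous-type key compares strings ordinally, which may differ from the InvariantCulture comparison in HasSamePropertiesThan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Item type.
public class Item
{
    public int Id { get; set; }   // assumption: only used to identify instances in the output
    public string X { get; set; }
    public string Y { get; set; }
    public string Z { get; set; }
}

public static class GroupByDemo
{
    // Groups by (X, Y, Z) and keeps the last element of every group.
    public static List<Item> KeepLastPerKey(IEnumerable<Item> items)
    {
        return items
            .GroupBy(i => new { i.X, i.Y, i.Z })
            .Select(g => g.Last())
            .ToList();
    }

    // Builds the sample collection from the question: A, A, A, B, A, A.
    public static List<Item> Sample()
    {
        var names = new[] { "A", "A", "A", "B", "A", "A" };
        return names
            .Select((n, idx) => new Item { Id = idx, X = n, Y = n, Z = n })
            .ToList();
    }

    public static void Main()
    {
        foreach (var item in KeepLastPerKey(Sample()))
        {
            Console.WriteLine($"[{item.Id}]{item.X}");
        }
    }
}
```

One thing to watch: in LINQ to Objects the groups come back in the order their keys first appear, so this yields [5]A before [3]B rather than the original relative order.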

Gabe

You could either use a HashSet or, using LINQ, do something like this:

var dups = new string[]{"A","A","B","B"};
var nonDupe = dups.Distinct().ToArray();
Zach Spencer
  • There are some properties of 'Item' that I'm not showing in the example (so technically [0]A is not completely equal to [1]A), because I depend on the order of appearance in the collection to distinguish the 'outdated' objects from the 'new' ones. That said, I think Distinct() would leave both [0]A and [1]A (at least in my example), since they're not completely equal, and afaik Distinct() takes the first Item that matches the selection criteria and ignores the remaining items in the collection (which is kind of the opposite of what I want: the last matching Item for that criteria). Right? – safejrz Jul 09 '14 at 20:40
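
For the "keep the last match" requirement raised in this comment, one possible approach (a sketch, not something proposed in the thread) is to combine the HashSet idea with a backward scan: walk the collection from the end, remember each (X, Y, Z) key already seen, and remove any earlier item whose key repeats. The Item class is a hypothetical stand-in, the code assumes the collection implements IList<Item>, and the tuple key needs C# 7 or later. As with an anonymous type, the tuple compares strings ordinally rather than with InvariantCulture.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Item type.
public class Item
{
    public int Id { get; set; }   // assumption: only used to identify instances in the output
    public string X { get; set; }
    public string Y { get; set; }
    public string Z { get; set; }
}

public static class BackwardScanDemo
{
    // Removes, in place, every item that has a later item with the same (X, Y, Z).
    public static void RemoveEarlierDuplicates(IList<Item> items)
    {
        var seen = new HashSet<(string, string, string)>();

        // Iterating backwards means RemoveAt never shifts an index we still have to visit.
        for (int i = items.Count - 1; i >= 0; i--)
        {
            var key = (items[i].X, items[i].Y, items[i].Z);

            // Add returns false when the key is already present, i.e. a later item has it.
            if (!seen.Add(key))
            {
                items.RemoveAt(i);
            }
        }
    }

    // Builds the sample collection from the question: A, A, A, B, A, A.
    public static List<Item> Sample()
    {
        var names = new[] { "A", "A", "A", "B", "A", "A" };
        var list = new List<Item>();
        for (int i = 0; i < names.Length; i++)
        {
            list.Add(new Item { Id = i, X = names[i], Y = names[i], Z = names[i] });
        }
        return list;
    }

    public static void Main()
    {
        var items = Sample();
        RemoveEarlierDuplicates(items);
        foreach (var item in items)
        {
            Console.WriteLine($"[{item.Id}]{item.X}");
        }
    }
}
```

Unlike the GroupBy version, this keeps the survivors in their original relative order ([3]B, then [5]A) and makes a single pass over the collection.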