1

I have following code to remove group from collection. Technically, there should be no duplicates, but it does remove all anyway. Any trick with LINQ to .Remove.Where.. ?

public void DeleteGroup(KeyValuePair<int, string> group)
            {
                while (this.Groups.Any(g => g.Key.Equals(group.Key)))
                {
                    var groupToRemove = this.Groups.First(g => g.Key.Equals(group.Key));
                    this.Groups.Remove(groupToRemove);
                }

            }
katit
  • 17,375
  • 35
  • 128
  • 256

1 Answers1

2

Assuming you are passing in a KeyValuePair with the same Key and the same Value this is the most efficient way possible with an ObseravableCollection.

public void DeleteGroup2(KeyValuePair<int, string> group)
{
    Groups.Remove(group);
}

This works because a KeyValuePair is a structure and when the overloaded operator == is applied it is comparing both the Key and the Value data members of the structure.

Again this will work just fine if you pass in the exact same Key and Value that is contained in the Groups obserabableCollection...if the Value does not match it will not work.

Behind the scenes an ObserableCollection is pretty much a list so it will have to iterate over every item performing the == operator. The same is true for the code you are posting. Just because it is using LINQ doesn't mean it's any more efficient. It's not like the LINQ where clause is using any indexing like it would be with LINQ to SQL.

public void DeleteGroup3(KeyValuePair<int, string> groupToDelete)
{
    var itemsToDelete =
        (
            from g in Groups
            where g.Key == groupToDelete.Key
            select g
        );

    foreach (var kv in itemsToDelete)
    {
        Groups.Remove(kv);
    }
}

This would probably be most efficient method using linq if you want to guarantee that you remove all items even those with duplicate keys.

public void DeleteGroup4(KeyValuePair<int, string> group)
{
    List<int> keyIndexes = new List<int>();
    int maxIndex = Groups.Count;
    for (int i = 0; i < maxIndex; i++)
    {
        if (Groups[i].Key == group.Key)
        {
            keyIndexes.Add(i);
        }
    }

    int indexOffset = 0;
    foreach (int index in keyIndexes)
    {
        Groups.RemoveAt(index - indexOffset);
        indexOffset++;
    }
}

This should have the best performance of all of them if you have multiple items with the same key or you don't know the exact same Key Value pair as the original.

I believe your DeleteGroup method is BIG O of 2N^2...N for the outer Any while Loop and N for the First and N for the Remove. Take outer Loop times the sum of the inside and you get 2N^2

DeleteGroup2 is BIG O of N and had the best performance of all of them. The drawback is that you need to know both the Key and the Value not just the Key. It will also only remove the first item it finds. It won't delete duplicate items with the same Key and the same Value.

DeleteGroup3 IS BIG O of N + N^2. N for the select. Worse case is that your key is in there N times so N^2 for the removal.

DeleteGroup4 is BIG O of 2N. N to find the indexes and in worst case if you have all items with the same key then its N to remove each of them as RemoveAtIndex is a Big O of 1. This has the best performance if you only know the Key and you have the possibility of having multiple items with the same Key.

If you know for a fact that you won't have duplicate items I would use DeleteGroup2. If you have the possibility of having duplicates DeleteGroup4 should have the best performance.

On a side note if won't have duplicates and you don't necessarily know both the Key and the Value you can still use the best performing option of DeleteGroup2 but create a class called KeyValueIntString with properties of Key and Value. Then overide the IsEquals method so that it only compares the Key property unlike the KeyValue struct that compares both the Key and the Value data members. Then you can use the ObserableCollection.Remove method and not have to worry about knowing the value that is stored. I.E. you could pass in instance of a KeyValueIntString that has the Key set but you don't have to worry about setting the Value property.

After commenting I decided to Add best readability method although it does have worse performance. Has a Big O of N^4. N for select, N for ToList, N for ForEach and N for Remove.

public void DeleteGroup5(KeyValuePair<int, string> groupToDelete)
{
    (
        from g in Groups
        where g.Key == groupToDelete.Key
        select g
    ).ToList().ForEach(g => Groups.Remove(g));
}
Dan P
  • 1,939
  • 3
  • 17
  • 30
  • Wow! Such an answer. I am looking for shorter looking solution - sure LINQ using iterations. It's about readability/maintainability. My collection will rarely have more than 1-2 items and remove operations will almost never be called :) I just wanted to write it shortest and cleanest way. DeleteGroup2 should work! – katit Mar 31 '12 at 02:31
  • In the case of 1-2 items I would go for readability. If you have 1000 or more items I would go for performance. During a lot of code reviews I have seen LINQ used just because it is cool and new and results in very bad performance unless you know how it works when lazy loading is applied or not applied. For instance using the ToList method when using LINQ 2 SQL on an entire table and applying the where clause in code results in horrible performance as primary keys, indexes, and foreign keys aren't used. All comparisons are done client side or web service side. Anyway my 2 cents. – Dan P Mar 31 '12 at 03:33
  • on readability vs performance when using LINQ. You just need to know how and when to apply it. – Dan P Mar 31 '12 at 03:34
  • Absolutely. I'm good with Linq as far as were it goes to DB. I'm trying to learn most efficient ways to write short clear code. – katit Mar 31 '12 at 03:38
  • If readability is of your utmost concern consider this: public void DeleteGroup5(KeyValuePair groupToDelete) { ( from g in Groups where g.Key == groupToDelete.Key select g ).ToList().ForEach(g => Groups.Remove(g)); } It uses a lamda expression for the Remove Item. It has a Big O of N^3. N for the ToList * N for the ForEach and N for the Remove. Again that's worse case scenario with items being in there N times. – Dan P Mar 31 '12 at 03:51