
Inspired by the post "LINQ group by property as a parameter", I wrote a parameterizable query that works nicely but has one performance drawback.

    public static void GetExpensesBy<TKey>(Func<Obj, TKey> myGroupingProperty)
    {
        var query = (from item in dataset
                     orderby item.ExpenseTime descending
                     select item).GroupBy(myGroupingProperty);
        // ....
    }
    // ..
    GetExpensesBy(p => p.Column);

is much slower than the direct query

    var query = (from item in expense
                 orderby item.ExpenseTime descending
                 select item).GroupBy(p => p.Column);

The difference is about 2s vs 0.1s in a table of 13000 rows.

Do you have any idea how to change the first version so that it performs as well as the direct query?

Antoine Dussarps
    That is because your `Func<>` is executed locally instead of on the server – Jeroen van Langen Apr 04 '16 at 14:25

2 Answers


Change your parameter type to an Expression:

public static void GetExpensesBy<TKey>( Expression<Func<Obj, TKey>> myGroupingProperty)
{
 //...
}

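When you pass a Func<Obj, TKey>, you are calling the GroupBy overload for IEnumerable<T>, which groups in memory. With an Expression<Func<Obj, TKey>>, the IQueryable<T> overload is chosen instead, so the grouping can be handed to the query provider.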
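
For illustration, here is a minimal sketch of what the complete method could look like. The `Obj` class shown here and the explicit `dataset` parameter are assumptions made to keep the example self-contained; in the question, `dataset` is a field whose type is not shown, and the method returns `void`.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the question's entity type.
public class Obj
{
    public string Column { get; set; } = "";
    public DateTime ExpenseTime { get; set; }
}

public static class ExpenseQueries
{
    // Because the key selector is an Expression, GroupBy resolves to
    // Queryable.GroupBy and the result stays an IQueryable, so a LINQ
    // provider gets the chance to translate the grouping.
    public static IQueryable<IGrouping<TKey, Obj>> GetExpensesBy<TKey>(
        IQueryable<Obj> dataset, // assumption: passed in rather than a field
        Expression<Func<Obj, TKey>> myGroupingProperty)
    {
        return (from item in dataset
                orderby item.ExpenseTime descending
                select item).GroupBy(myGroupingProperty);
    }
}
```

The call site does not change: `ExpenseQueries.GetExpensesBy(source, p => p.Column)` still compiles, because the compiler converts the lambda to an expression tree when the parameter type asks for one.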

ocuenca
    Relevant: http://stackoverflow.com/questions/793571/why-would-you-use-expressionfunct-rather-than-funct – Oliver Apr 04 '16 at 14:25

Hard to tell for certain without knowing what dataset is, but if it is an IQueryable, then one difference between the two is that your first query (since it takes a Func argument) is using the IEnumerable extensions and doing the grouping in-memory. The second example is compiling your lambda to an Expression and thus adds the grouping expression to the base query, passing it down to the provider if possible.

So the difference may be that the second query is grouping in the data source, while the first is pulling in all data and grouping in memory.
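
The overload selection can be checked without a database. The sketch below is only an illustration: the class and method names are made up, and it uses an in-memory sequence wrapped with `AsQueryable()` rather than a real provider. It groups the same `IQueryable<string>` once with a `Func` and once with an `Expression`, then tests whether the result is still an `IQueryable`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    // A Func key selector can only bind to Enumerable.GroupBy,
    // so the query leaves the IQueryable pipeline at this point.
    public static IEnumerable<IGrouping<int, string>> GroupWithFunc(
        IQueryable<string> source, Func<string, int> key)
        => source.GroupBy(key);

    // An Expression key selector binds to Queryable.GroupBy,
    // so the grouping is appended to the query's expression tree.
    public static IEnumerable<IGrouping<int, string>> GroupWithExpression(
        IQueryable<string> source, Expression<Func<string, int>> key)
        => source.GroupBy(key);

    // True when the object is still a composable provider query.
    public static bool StaysQueryable(object query) => query is IQueryable;
}
```

With `var words = new[] { "a", "bb", "cc" }.AsQueryable();`, `StaysQueryable(GroupWithFunc(words, w => w.Length))` is false and `StaysQueryable(GroupWithExpression(words, w => w.Length))` is true. Against a database-backed provider, the first case is the one that pulls every row into memory before grouping.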

Just change your parameter from Func<Obj, TKey> to Expression<Func<Obj, TKey>> and see if that helps.

D Stanley