3

I have a database that users can run a variety of calculations on. The calculations run on 4 different columns each calculation does not necessarily use every column i.e. calculation1 might turn into sql like

SELECT SUM(Column1) 
FROM TABLE 
WHERE Column1 is not null

and calculation2 would be

SELECT SUM(Column2)
WHERE Column2 is null

I am trying to generate this via linq and I can get the correct data back by calculating everything every time such as

table.Where(x => x.Column1 != null)
     .Where(x => x.Column2 == null)
     .GroupBy(x => x.Date)
     .Select(dateGroup => new
             {
               Calculation1 = dateGroup.Sum(x => x.Column1 != null),
               Calculation2 = dateGroup.Sum(x => x.Column2 == null)
             }

The problem is that my dataset is very large, and so I do not want to perform a calculation unless the user has requested it. I have looked into dynamically generating Linq queries. All I have found so far is PredicateBuilder and DynamicSQL, which appear to only be useful for dynamically generating the Where predicate, and hardcoding the sql query itself as a string with the Sum(Column1) or Sum(Column2) being inserted when necessary.

How would one go about dynamically adding the different parts of the Select query into an anonymous type like this? Or should I be looking at an entirely different way of handling this

Alexander Burke
  • 534
  • 7
  • 22
  • Instead of dynamically creating your queries, could you encapsulate them into their own methods and call the methods conditionally? – Jerreck Aug 17 '15 at 19:31
  • Take a look into *Dynamic LINQ*: http://dynamiclinq.azurewebsites.net/. It allows for dynamic clauses throughout many of the LINQ operations, including `Select`. – Nathan A Aug 17 '15 at 19:59
  • Ive looked into Dynamic Linq a bit. I looked more at your link there but getting rid of type safety seems like a choice I only want to make if I have to – Alexander Burke Aug 17 '15 at 20:25

2 Answers2

0

You can return your query without executing it, which will allow you to dynamically choose what to return.

That said, you cannot dynamically modify an anonymous type at runtime. They are statically typed at compile time. However, you can use a different return object to allow for dynamic properties without needing an external library.

var query = table
    .Where(x => x.Column1 != null)
    .Where(x => x.Column2 == null)
    .GroupBy(x => x.Date);

You can then dyamically resolve queries with any one of the following:

  1. dynamic

    dynamic returnObject = new ExpandoObject();
    
    if (includeOne)
        returnObject.Calculation1 = groupedQuery.Select (q => q.Sum(x => x.Column1));
    
    if (includeTwo)
        returnObject.Calculation2 = groupedQuery.Select (q => q.Sum (x => x.Column2));
    
  2. Concrete Type

    var returnObject = new StronglyTypedObject();
    if (includeOne)
        returnObject.Calculation1 = groupedQuery.Select (q => q.Sum(x => x.BrandId));
    
  3. Dictionary<string, int>

David L
  • 32,885
  • 8
  • 62
  • 93
  • I see how that would work if I could do each calculation separately. The issue is that I would have to do a different grouping of calculations that I wont know until the users requests come in. I had thought about hardcoding all the possible combinations but since theres already 14 possible calculations and there could be more, its not feasible. Is there a way around that with your suggestion? – Alexander Burke Aug 17 '15 at 20:10
  • I think there might be, but it would be helpful if you could describe how the requests will come in. Do you have a sample? – David L Aug 17 '15 at 20:25
  • Currently when the backend gets the list of requests there are a few parameters that apply to every calculation, like report date, which are done in a where clause and grouped like you demonstrated. Then there is a list of calculation names and a switch statement that grabs data for the selected calculations. The problem is that to have this data available in the switch statement, I need to have Linq already calculate everything. I cant find a way to do different Select statements on the fly without resorting to hardcoded strings in Dynamic Linq – Alexander Burke Aug 17 '15 at 20:43
  • It ends up being that for one user I might want the Select to be `Select(x => new { calculation1 = Sum(x.column1), calculation2 = Count(x = >x. column2 > 0)})` but for another user I would want the Select to be `Select(x => new { calculation3 = Sum(x.column2)})` Only this is for 14 different calculations and the users can pick anywhere from 0 to all 14 calculations to run – Alexander Burke Aug 17 '15 at 20:45
  • @AlexanderBurke It sounds like you need a richer system for choosing how a user selects a query and how to build those queries out. That's possible, although I suspect somewhere along the way you're going to lose quite a bit of your strong typing since you may need to end up building something very dynamic. That said, based on what you've given so far it's difficult to really improve my answer and make it more useful to you. – David L Aug 18 '15 at 00:16
0

I solved this and kept myself from having to lose type safety with Dynamic Linq by using a hacky workaround. I have a object containing bools that correspond to what calculations I want to do such as

public class CalculationChecks
{
  bool doCalculation1 {get;set;}
  bool doCalculation2 {get;set;}
}

and then do a check in my select for whether or not I should do the calculation or return a constant, like so

Select(x => new 
{
  Calculation1 = doCalculation1 ? DoCalculation1(x) : 0,
  Calculation2 = doCalculation2 ? DoCalculation2(x) : 0
}

However, this appears to be an edge case with linq to sql or ef, that causes the generated sql to still do the calculations specified in DoCalculation1() and DoCalculation2 and then use a case statement to decide whether or not its going to return the data to me. It runs signficantly slower,40-60% in testing, and the execution plan shows that it uses a much more inefficient query.

The solution to this problem was to use an ExpressionVisitor to go through the expression and remove the calculations if the corresponding bool was false. The code showing how implement this ExpressionVisitor was provided by @StriplingWarrior on this question Have EF Linq Select statement Select a constant or a function

Using both of these solutions together is still not creating sql that runs at 100% the speed of plain sql. In testing it was within 10s of plain sql no matter the size of the test set, and the major portions of the execution plan were the same

Community
  • 1
  • 1
Alexander Burke
  • 534
  • 7
  • 22