0

I had some serious speed issues with the LINQ in this code (variable names have been changed)

var A = _service.GetA(param1, param2); // Returns Enumerable results

var results = (from b in _B.All() // _B.All() returns IQueryable
        join c in _C.All() on b.Id equals c.Id // _C.All() returns IQueryable
        join a in A on a.Id equals c.Id
        where b.someId == id && a.boolVariable // A bool value
        select new
        {
          ...
        }).ToList();

This LINQ took over 10 seconds to execute even though the number of rows in B and C tables were less than 100k.

I looked into this and by trial and error I managed to get the LINQ execution time to 200ms by changing the code to this:

var A = _service.GetA(param1, param2).Where(a => a.boolVariable); // Returns Enumerable results

var results = (from b in _B.All() // _B.All() returns IQueryable
        join c in _C.All() on b.Id equals c.Id // _C.All() returns IQueryable
        join a in A on a.Id equals c.Id
        where b.someId == id
        select new
        {
          ...
        }).ToList();

So my question is, why does this simple change have such drastic effects on the LINQ performance? The only change is that I filter the Enumerable list beforehand, the A enumerable has about 30 items before filtering and 15 after filtering.

David Tansey
  • 5,813
  • 4
  • 35
  • 51
Dadoss
  • 17
  • 3
  • 1
    When you ran a SQL Trace for the original code, what was being submitted to the database? When you ran a SQL Trace for the final code, what was being submitted to the database? – mjwills Nov 15 '18 at 20:28
  • I'm guessing that the order of `a.boolVariable` and `b.someId == id` might be reversed or something. – Ringil Nov 15 '18 at 20:33
  • Please update your question to show exactly what was submitted to the database in both instances. I think you forget to edit your question to include it. – mjwills Nov 15 '18 at 20:47
  • Please also show us the source code for `GetA`. – mjwills Nov 15 '18 at 20:48
  • @Ringil Why do you think the order of the conditions has such effect on the speed? – Dadoss Nov 15 '18 at 20:48
  • @mjwills I think the source of GetA is irrelevant as it does not affect the speed of the LINQ. I already checked the speed of the GetA function and it's not a problem. – Dadoss Nov 15 '18 at 20:50
  • 1
    @Dadoss If you aren't willing to show the source code, your question is much more likely to be closed. It is crucial that you show a [mcve]. Many people asking questions think certain things are irrelevant. And they are often incorrect. It is crucially relevant. You are tying our hands behind our back when we are trying to help you. The mistake is **almost certainly** because that method returns `IEnumerable`. – mjwills Nov 15 '18 at 20:51
  • Possible duplicate of [Returning IEnumerable vs. IQueryable](https://stackoverflow.com/questions/2876616/returning-ienumerablet-vs-iqueryablet) – mjwills Nov 15 '18 at 20:53
  • @mjwillls I understand, I'll see if I can post the source code tomorrow when I'm at work, I don't have access to it now. – Dadoss Nov 15 '18 at 21:31

1 Answers1

2

In your first scenario: first it joins all the records in A which would take long time to join, then filters out for a.boolVariable.

In your second scenario you have a smaller subset of records for A prior to joining - of course this would take less time to join.

Richard Mneyan
  • 656
  • 1
  • 13
  • 20
  • 2
    Even if you have 10 records vs 5 that would mean 10 times all other 100K records in those B & C vs 5 times 100K. Not to mention that all those records afterwards need to be filtered out (filtering 10 records vs 10000 records). – Richard Mneyan Nov 15 '18 at 20:06
  • @Dadoss So basically when you mix both IQueryable and IEnumerable there is a gray line on what will and it will not execute on the sql server. Now by using the IENumerable in the where clause (first case) it will force part of the query to be done into the memory. But in the second case the Entity Framework will make the sql a little bit better by making the last join into a virtual select or completely taking it off and adding it as where (in). All these though is just speculations taken out of various linq queries that i tried over the years – Nick Polyderopoulos Nov 15 '18 at 23:11