0

I have an application which does a lot of calculations based on prices, which takes ~87 seconds at the moment. Based on the timeline, I suspect that I am using my includes in a way that is highly suboptimal for Entity Framework, and that I can make major improvements.

My timeline is like this:

  • 0 sec - receive GET request
  • 2 sec - send query to SQL Server (query with all necessary incudes, the result of which is >40.000 characters. I added some parts at the bottom of the question)
  • 5 sec - receive query result from SQL Server
  • 45 some thread is exiting
  • 46 some thread is exiting
  • 50 some thread is exiting
  • 80 closing SQL connection (I think that it takes EF6 ~75 seconds to map the query result back to my C# classes but I am not sure)
  • 80 starting calculations
  • 87 all calculations are done

The query is designed mostly like this

var result = await db.MyAuthorizedFooExtension()
    .Include(x => x.Foofie)
    .Include(x => x.Bar.Baz)
    .Include(x => x.FooFoo.Select(y => y.BarBar))
    .Include(x => x.FooFoo.Select(y => y.BazBaz))
//etc.
    .ToListAsync()

The result looks like this:

[UnionAll7].[C9] AS [C55], 
...
   FROM  (SELECT 
        CASE WHEN ([Join2].[ID1] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1], 
        [Extent1].[ID] AS [ID], 
...
        [Join2].[ID1] AS [ID5], 
...
        CAST(NULL AS varchar(1)) AS [C4], 
...
        FROM   [dbo].[Contract] AS [Extent1]
        INNER JOIN [dbo].[Leverancier] AS [Extent2] ON [Extent1].[Leverancier] = [Extent2].[ID]
        LEFT OUTER JOIN  (SELECT [Extent3].[ID] AS [ID1], [Extent3].[Contract] AS [Contract], [Extent3].[ContractStatusOud] AS [ContractStatusOud], [Extent3].[ContractStatusNieuw] AS [ContractStatusNieuw], [Extent3].[MutatieDatumTijd] AS [MutatieDatumTijd], [Extent3].[Medewerker] AS [Medewerker], [Extent3].[Status] AS [Status1], [Extent3].[LastModifiedDate] AS [LastModifiedDate1], [Extent3].[LastModifiedBy] AS [LastModifiedBy1], [Extent4].[ID] AS [ID2], [Extent4].[Naam] AS [Naam], [Extent4].[WindowsNaam] AS [WindowsNaam], [Extent4].[Rol] AS [Rol], [Extent4].[Unit] AS [Unit], [Extent4].[Status] AS [Status2], [Extent4].[LastModifiedDate] AS [LastModifiedDate2], [Extent4].[LastModifiedBy] AS [LastModifiedBy2]
            FROM  [dbo].[ContractWijzigingHistorie] AS [Extent3]
            INNER JOIN [dbo].[Medewerker] AS [Extent4] ON [Extent3].[Medewerker] = [Extent4].[ID] ) AS [Join2] ON [Extent1].[ID] = [Join2].[Contract]
        WHERE ( EXISTS (SELECT 
            1 AS [C1]
            FROM [dbo].[Contract] AS [Extent5]
            WHERE ( EXISTS (SELECT 
                1 AS [C1]
                FROM [dbo].[Leverancier] AS [Extent6]
                WHERE [Extent6].[ID] = [Extent5].[Leverancier]
            )) AND ([Extent5].[ID] = [Extent1].[ID])
        )) AND ([Extent1].[Status] IN (1)) AND ((DATEPART (year, [Extent1].[Startdatum])) <= @p__linq__0) AND ((DATEPART (year, [Extent1].[Einddatum])) >= @p__linq__1)
    UNION ALL
        SELECT 
        2 AS [C1], 
        [Extent7].[ID] AS [ID], 
...
        CAST(NULL AS int) AS [C9], 
        CAST(NULL AS datetime2) AS [C10], 
...
        [Extent9].[PeriodeWaarde] AS [PeriodeWaarde], 
...
        [Project9].[LastModifiedBy] AS [LastModifiedBy], 
...
        [UnionAll5].[ID2] AS [C60], 
...
        CAST(NULL AS int) AS [C62], 
...
etc.

How do I improve the performance while still using Entity Framework?

Daniël Camps
  • 1,737
  • 1
  • 22
  • 33
  • 3
    At a very minimum you should [include the actual Execution Plan](https://stackoverflow.com/a/7359705/1260204), you could use [Paste the Plan](https://www.brentozar.com/pastetheplan/) and share the link in your question. Also [try to read it yourself](https://stackoverflow.com/a/759097/1260204), maybe you can figure out the performance issue(s) with your query. – Igor Oct 06 '17 at 13:58
  • Is using a view or UDF an option? – DiskJunky Oct 06 '17 at 13:58
  • @Igor I will create a plan right now – Daniël Camps Oct 06 '17 at 14:00
  • @DiskJunky I don't see why not – Daniël Camps Oct 06 '17 at 14:00
  • If performance is critical then getting as close as possible to SQL would be the way to go. Once you have your view or UDF set up, you can add it to Entity Framework like a table and use it from there. It also gives you better control over the query plan to structure your queries better – DiskJunky Oct 06 '17 at 14:01
  • Check out this [article](http://www.sqlservercentral.com/articles/Entity+Framework/141158/?utm_source=SSC&utm_medium=pubemail) – Jarealist Oct 06 '17 at 14:21
  • Without knowing the size of the tables or the columns, two things jump at me. 1) You are probably pulling a LOT of data with all those includes and 2) The fact it takes 75 seconds to close the connection seems to indicate that number 1 is true. Maybe you should consider reworking your query to pull only what you need. You should aim to do as much filtering and ordering as possible at the SQL side. – JuanR Oct 06 '17 at 14:24
  • This is one of many possible duplicates. The message is: reduce the number of `Includes`, but since you don't show the actual code, we can't tell you how. – Gert Arnold Oct 06 '17 at 14:37

1 Answers1

0

This begs the question of what are you doing after you get the result of the query? If you read out your query, you will see that you are getting all of the results whilst joining on to the relevant tables eagerly.

If you are filtering this data down after you retrieve the data from the query, then you should be writing your query in a way where you are filtering the data before materialising it (calling ToList). Do you also need to call the related tables? You should also only Select from the database the columns you know you will use.

Now, again it would be easier to know what you want your method to do. Create a new object or an anonymous object to select into:

var result = await db.MyAuthorizedFooExtension()
//etc.
.Select(s => new
{
    Id = s.Id,
    FooName = s.FooName,
    BazId = s.Bar.Baz.Id
}
.ToListAsync();

And materialise at the end. If you require Includes in optional scenarios, make the query an IQueryable, perform you logic and materialise at the end.

nogalskis
  • 43
  • 4
  • There are a lot of complex calculations following the query (too many to post in the question). I guess I only need ~80% of the columns of each table, so there could be some room for improvement there, but I think that there are bigger optimizations I think that I can make with less effort and much more result. Materializing the query is a conscious decision, running all of the subsequent calculations from the materialized list is significantly faster than further building all subsequent steps on the IQueryable (I tried this before) – Daniël Camps Oct 06 '17 at 14:12
  • The first thing I'd recommend you do is optimise the data you retrieve from the query. Make sure you aren't making calls to joining tables that you know you do not need. You can also take a look at the Execution Plan to see what calls take the most time. You can also try and add Indexes to the tables you are querying to see if this will improve execution time. – nogalskis Oct 06 '17 at 14:30
  • Excellent point :) but I'm positive that I need all the tables I'm joining. Execution plan is something I will definitely look into – Daniël Camps Oct 06 '17 at 14:33
  • Even if you need the tables you are joining by eager loading, you don't have to eager load. By selecting the data in to an object before you materialise, you will not be loading all of the records from all of the tables but only the ones you require. You will (I believe) also be able to use LINQ's Where action to select only what you need, something I don't think you can do in the Include. Obviously, in the case where you try and use records which you have selected, Lazy Loading will kick in and will perform more calls to the database, which may also have a negative effect. Bear in mind. – nogalskis Oct 06 '17 at 14:39