1

So, I've got a fairly large relational database that I'm driving with EntityFramework in C#/.Net. I'm trying to optimize it to make it better at getting only select data.

I am using .AsNoTracking() as a bool input to the repository before trying to limit any data.

Up until now, I've been using repositories where All of the related tables are included in the query, even when I only want to search for certain things.

I've basically got a massive list of .Include(...) before returning an IQueryable<Product>.

But say I only want to return a couple of particular tables(as those are the only ones I need to display certain data), how is the best way of doing this?

So far, I've got my repository taking in a list of enum values for the relations available to that entity.

I've implemented looping through this list, and trying to do a .Include(...) on each required relation, but this appears to take ALOT longer than loading everything(sometimes, even timing out and giving me a SQL timeout). A short snippet of this is below.

query is an IQueryable<Product>

foreach (var part in this.IncludeList)
{
    switch(part)
    {
        case ProductPart.Tag:
        query = query.Include(p => p.Tags);
        break;
        case ProductPart.Video:
        query = query.Include(p => p.Videos);
        break;
    }
}

Is there a knack to using this out of chain that I'm missing?

Erik Philips
  • 53,428
  • 11
  • 128
  • 150
Matt Ellis
  • 87
  • 1
  • 2
  • 7
  • Is it possible to trace the generated queries on the database? We had a similar problem recently where Entity Framework got all subentities by lazy loading even when we used Include. We were able to solve this by putting Context.Configuration.LazyLoadingEnabled = false; found at this link: http://stackoverflow.com/questions/16798796/ef-include-with-where-clause – Bruno V Oct 29 '14 at 11:04
  • @BrunoV Thanks for the comment! Unfortunately, i don't think that will work. I have turned off lazyloading using that exact same code-block `this.Context.Configuration.LazyLoadingEnabled = false;` Might it be something caused by me using `return query.AsQueryable();` so i can make use of deferred execution? – Matt Ellis Oct 29 '14 at 11:41

1 Answers1

-1

keep in mind LINQ use deferred execution plan. With your loop you are expecting EF to create independent queries and load Tags, then Videos. This is not what will happen. The resulting query will be equivalent to :

query = query.Include(p => p.Tags).Include(p => p.Videos);

This may result in a pretty big SQL transaction.

if you want to optimize your query, you should try the following :

foreach (var part in this.IncludeList)
{
    switch(part)
    {
        case ProductPart.Tag:
        query.SelectMany(p => p.Tags).Load();
        break;

        case ProductPart.Video:
        query.SelectMany(p => p.Videos).Load();
        break;
    }
}

The result will be more efficient as each query will be executed independently and navigation properties will be mapped on the client side by entity framework.

Seb
  • 1,230
  • 11
  • 19
  • 2
    Getting all of the data in several round trips, instead of one, is going to make it take *longer*. You're adding in the latency of two round trips while pulling the same amount of data. – Servy Oct 29 '14 at 16:03
  • @Servy : in the past, using this strategy I was able to cut the transaction size from 10mb to 500k (according to fiddler) on SQL server. I will do some benchmarking to confirm what you are saying is right. – Seb Oct 29 '14 at 16:35
  • If that's the case then you were changing something beyond just the transformation you've described here (or weren't measuring the results properly). Loading the *exact same data* in several round trips won't reduce the amount of data loaded by an order of magnitude. – Servy Oct 29 '14 at 16:36
  • That's the point : less data is loaded from the server. The result of `query` will be loaded once in my case. Using join, each rows may be replicated many times in the resulting dataset (`p.Tag.Count()` times `p.Video.Count()`) Maybe I'm wrong, I will confirm that – Seb Oct 29 '14 at 16:41
  • It was using york sorry, not fiddler – Seb Oct 29 '14 at 16:48
  • Ok, made some test on one of my db : SelectMany (4.3Mb, 7996ms) ... EF Include (11Mb, 10 593ms) It's something – Seb Oct 29 '14 at 17:24
  • And SelectMany allow Where clauses when Include does not, which allow further optimizations – Seb Oct 29 '14 at 17:26
  • Again, I imagine that you're either not doing what you claim to be doing, or your not measuring the results properly. The most likely mistake would be not materializing all of the data in each case when you intended to, but there are so many possible benchmarking errors, especially with something this complex, that it's impossible to really say without an actual example. – Servy Oct 29 '14 at 17:27
  • I'm agreeing with you. Maybe I should start a programming review question and do a proper bench. Regards. – Seb Oct 29 '14 at 17:36
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/63884/discussion-between-seb-and-servy). – Seb Oct 29 '14 at 18:23
  • @Seb Thanks for the discussion guys! Those figures look impressive, and i'll be sure to take a look at the approach in this example. Cheers! – Matt Ellis Oct 30 '14 at 09:04
  • @Seb I've been looking at the implementation i've got, and this approach doesn't offer me any benefit. It seems to go in the opposite direction. – Matt Ellis Nov 04 '14 at 16:52