1

I have a following query which runs very fast:

var query =
    (from art in ctx.Articles

     join phot in ctx.ArticlePhotos on art.Id equals phot.ArticleId
     join artCat in ctx.ArticleCategories on art.Id equals artCat.ArticleId
     join cat in ctx.Categories on artCat.CategoryId equals cat.Id
     where art.Active && art.ArticleCategories.Any(c => c.Category.MaterializedPath.StartsWith(categoryPath))
     orderby art.PublishDate descending
     select new ArticleSmallResponse
     {
         Id = art.Id,
         Title = art.Title,
         Active = art.Active,
         PublishDate = art.PublishDate ?? art.CreateDate,
         MainImage = phot.RelativePath,
         RootCategory = art.Category.Name,
         Summary = art.Summary
     })
     .AsNoTracking().Take(request.Take);

However, if I add group by and change query to following statement, it runs much much slower.

var query =
    (from art in ctx.Articles

     join phot in ctx.ArticlePhotos on art.Id equals phot.ArticleId
     join artCat in ctx.ArticleCategories on art.Id equals artCat.ArticleId
     join cat in ctx.Categories on artCat.CategoryId equals cat.Id
     where art.Active && art.ArticleCategories.Any(c => c.Category.MaterializedPath.StartsWith(categoryPath))
     orderby art.PublishDate descending
     select new ArticleSmallResponse
     {
         Id = art.Id,
         Title = art.Title,
         Active = art.Active,
         PublishDate = art.PublishDate ?? art.CreateDate,
         MainImage = phot.RelativePath,
         RootCategory = art.Category.Name,
         Summary = art.Summary
     })
     .GroupBy(m => m.Id)
     .Select(m => m.FirstOrDefault())
     .AsNoTracking().Take(request.Take);

Homepage calls query 9 times for each category. With the first version of query, without caching turned on and connecting to SQL remotely, page load is around 1.5 seconds, which makes it almost instant when application is on server, but second way makes homepage load around 39 seconds when SQL is remotely.

Can it be fixed without rewriting the entire query in to the view or stored procedure?

stop-cran
  • 4,229
  • 2
  • 30
  • 47
Robert
  • 3,353
  • 4
  • 32
  • 50

1 Answers1

1

Grouping is an expensive operation on the database end. Without knowing what your database looks like and what indexes you've setup, it will be difficult to determine. Why not just group on the client side after the data has arrived (assuming its not an overwhelming amount).

This question explains how. Group by in LINQ

AngryHacker
  • 59,598
  • 102
  • 325
  • 594
  • I had that idea too, but then skip/take pagination would not be correct, right? – Robert Sep 14 '17 at 07:11
  • @Robert In that case, you should fire up the SQL Profiler and see what sql Entity Framework is generating. Then take that sql and run it in SSMS to see the execution plan. That will show you what the performance bottleneck is. – AngryHacker Sep 14 '17 at 17:46
  • Everything went speedlight, when i added correct indexes to relational tables. – Robert Sep 14 '17 at 17:47