
I've got a piece of code that executes a pretty big query, with lots of joins, etc. For simplicity's sake, here's more or less the structure:

var rootQuery = MainQuery(); // IQueryable<MyClass>
var jq1 = JoinQuery1(); // IQueryable<anonymous type>
var jq2 = JoinQuery2();
...
var jq6 = JoinQuery6();
var bigQuery = from x in rootQuery
               join j1 in jq1 on x.ID equals j1.MainID into join1
               from j1 in join1.DefaultIfEmpty()
               join j2 in jq2 on x.ID equals j2.MainID into join2
               from j2 in join2.DefaultIfEmpty()
               ...
               join j6 in jq6 on x.ID equals j6.MainID into join6
               from j6 in join6.DefaultIfEmpty()
               select new {
                            x.ID,
                            x.Field1,
                            j1.FieldA,
                            j2.FieldB,
                            ...
                            j6.FieldF
                          };
var sw = Stopwatch.StartNew();
var loadedData = bigQuery.ToList();
sw.Stop();
Console.WriteLine("Time Elapsed = {0} ms", sw.ElapsedMilliseconds);

The stopwatch shows over 30 seconds elapsed, for a result of 9 rows and about 30 columns of data (no text fields). So I sniffed the query using SQL Server Profiler, and indeed it is a monster. I fiddled and prodded and optimized a few indexes, and got the query itself to execute in under 200 milliseconds. But when I run the code, .ToList() still takes over 30 seconds, even though SQL Profiler says the query part takes under 200 ms!

What is going on here? Why is .ToList taking so long to load such a small data set into memory?

Edit: I worked around the problem (see my answer below), but I'm not satisfied. Answer credit to anyone who can propose a better way, or at least explain why the object materialization is so expensive that it works out cheaper to run 7 separate queries and join them in local memory.

Shaul Behr

1 Answer


This seems very counter-intuitive, but I solved the problem by running the IQueryables separately (using Jon Skeet's neat NullOr extension):

var rootQuery = MainQuery().ToList();
var jq1 = JoinQuery1().ToList();
var jq2 = JoinQuery2().ToList();
...
var jq6 = JoinQuery6().ToList();
var bigQuery = from x in rootQuery
               join j1 in jq1 on x.ID equals j1.MainID into join1
               from j1 in join1.DefaultIfEmpty()
               join j2 in jq2 on x.ID equals j2.MainID into join2
               from j2 in join2.DefaultIfEmpty()
               ...
               join j6 in jq6 on x.ID equals j6.MainID into join6
               from j6 in join6.DefaultIfEmpty()
               select new {
                            x.ID,
                            x.Field1,
                            FieldA = j1.NullOr(j => j.FieldA),
                            FieldB = j2.NullOr(j => j.FieldB),
                            ...
                            FieldF = j6.NullOr(j => j.FieldF)
                          };

The SQL side now takes longer in total (more round trips), but at least the LINQ to Entities side is practically instantaneous: the joins run in memory via LINQ to Objects, so there is no giant SQL statement to translate or materialize.
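For reference, `NullOr` is not a framework method; a minimal sketch of Jon Skeet's extension (the exact signature and constraints in his version may differ) looks like this:

```csharp
using System;

public static class NullOrExtensions
{
    // Returns default(TResult) (null for reference types) when source is null;
    // otherwise applies the projection to source.
    public static TResult NullOr<TSource, TResult>(
        this TSource source, Func<TSource, TResult> projection)
        where TSource : class
    {
        return source == null ? default(TResult) : projection(source);
    }
}
```

Since `DefaultIfEmpty()` yields `null` for unmatched left-join rows in LINQ to Objects, this lets the final projection read `j1..j6` fields without throwing a `NullReferenceException`.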

Answer credit to anyone who can explain this weird behavior!
