5

I have a query that joins two tables that do not have a defined relationship in Entity Framework, and where the joined table has a one-to-many navigational property to a third table.

There is a one-to-many relationship between msg and job, but there isn't a foreign key, and there isn't an association defined in the .EDMX.

There is a one-to-many relationship between job and lock, and there is an association defined in the .EDMX, so job has a job.locks navigation propertym and lock has a lock.job navigation property.

My original query:

var msgsAndJobs = (
    from m in dbContext.msgs
    join j in dbContext.jobs
        on new { jobid = m.jobid, priority = m.priority }
        equals new { jobid = j.jobid, priority = j.priority }
    where m.msgtype == "EMERGENCY"
    orderby new { m.recvdt }
    select new { m, j }
    );

I'm finding that EF is generating one query for the join, then executing a second query to populate the navigational property for each and every record returned by the join.

Microsoft's docs explain the issue: https://msdn.microsoft.com/en-us/data/jj574232.aspx

So I thought that I'd be able to use an .Include() clause, to eagerly load the records involved. But it doesn't seem to be working:

My new query:

var msgsAndJobs = (
    from m in dbContext.msgs
    join j in dbContext.jobs.Include("locks")
        on new { jobid = m.jobid, priority = m.priority }
        equals new { jobid = j.jobid, priority = j.priority }
    where m.msgtype == "EMERGENCY"
    orderby new { m.recvdt }
    select new { m, j }
    );

And it's still generating a query for each job lock.

Any ideas what I'm doing wrong?

Jeff Dege
  • 11,190
  • 22
  • 96
  • 165

1 Answers1

8

That's the problem with Include. It's too easy to make it ineffective and it's not always clear why it doesn't work. One thing that breaks an Include is changing the shape of the query. Another one is projecting to a non-entity type or an anonymous type.

This seems to make it incredibly hard to predict when Include does and doesn't work, but there is one simple trick: Include always works if you apply it at the end of the query.

If you can't apply it there, it wouldn't have been effective anyway.

With this in mind, if we look at your case, it's clear why Include doesn't work. You can't do

(... select new { m, j }).Include("locks"); // Runtime error

because locks isn't a navigation property of the anonymous type, obviously. It's more evident if you use the lambda version:

(... select new { m, j }).Include(x => x.locks); // Doesn't compile

So your Include is ineffective and the locks are loaded on demand.

Fortunately, there is a way out due to relationship fixup, the process by which EF knits together entities in the context by their navigation properties. Change your query into this:

var msgsAndJobs = (
    from m in dbContext.msgs
    join j in dbContext.jobs
        on new { jobid = m.jobid, priority = m.priority }
        equals new { jobid = j.jobid, priority = j.priority }
    where m.msgtype == "EMERGENCY"
    orderby new { m.recvdt }
    select new { m, j, j.locks }
    ).AsEnumerable()
    .Select(x => new { x.m, x.j });

If you execute this query, .AsEnumerable() causes the result to be loaded into the context, after which you can select the result you originally wanted. Now you'll notice that EF has populated all job.locks collections.

One important thing though: you have to disable lazy loading, otherwise addressing a job.locks collection will still trigger lazy loading. This is because, even though the collection is populated, it is not marked as Loaded internally.

Community
  • 1
  • 1
Gert Arnold
  • 105,341
  • 31
  • 202
  • 291