I have the following code, which is misbehaving:
TPM_USER user = UserManager.GetUser(context, UserId);
var tasks = (from t in user.TPM_TASK
where t.STAGEID > 0 && t.STAGEID != 3 && t.TPM_PROJECTVERSION.STAGEID <= 10
orderby t.DUEDATE, t.PROJECTID
select t);
The first line, UserManager.GetUser
just does a simple lookup in the database to get the correct TPM_USER
record. However, the second line causes all sorts of SQL chaos.
First off, it's executing two SQL statements here. The first one grabs every single row in TPM_TASK
which is linked to that user, which is sometimes tens of thousands of rows:
SELECT
-- Columns
FROM TPMDBO.TPM_USERTASKS "Extent1"
INNER JOIN TPMDBO.TPM_TASK "Extent2" ON "Extent1".TASKID = "Extent2".TASKID
WHERE "Extent1".USERID = :EntityKeyValue1
This query takes about 18 seconds on users with lots of tasks. I would expect the WHERE clause to contain the STAGEID filters too, which would remove the majority of the rows.
Next, it seems to execute a new query for each TPM_PROJECTVERSION
pair in the list above:
SELECT
-- Columns
FROM TPMDBO.TPM_PROJECTVERSION "Extent1"
WHERE ("Extent1".PROJECTID = :EntityKeyValue1) AND ("Extent1".VERSIONID = :EntityKeyValue2)
Even though this query is fast, it's executed several hundred times if the user has tasks in a whole bunch of projects.
The query I would like to generate would look something like:
SELECT
-- Columns
FROM TPMDBO.TPM_USERTASKS "Extent1"
INNER JOIN TPMDBO.TPM_TASK "Extent2" ON "Extent1".TASKID = "Extent2".TASKID
INNER JOIN TPMDBO.TPM_PROJECTVERSION "Extent3" ON "Extent2".PROJECTID = "Extent3".PROJECTID AND "Extent2".VERSIONID = "Extent3".VERSIONID
WHERE "Extent1".USERID = 5 and "Extent2".STAGEID > 0 and "Extent2".STAGEID <> 3 and "Extent3".STAGEID <= 10
The query above would run in about 1 second. Normally, I could specify that JOIN
using the Include
method. However, this doesn't seem to work on properties. In other words, I can't do:
from t in user.TPM_TASK.Include("TPM_PROJECTVERSION")
Is there any way to optimize this LINQ statement? I'm using .NET4 and Oracle as the backend DB.
Solution:
This solution is based on Kirk's suggestions below, and works since context.TPM_USERTASK
cannot be queried directly:
var tasks = (from t in context.TPM_TASK.Include("TPM_PROJECTVERSION")
where t.TPM_USER.Any(y => y.USERID == UserId) &&
t.STAGEID > 0 && t.STAGEID != 3 && t.TPM_PROJECTVERSION.STAGEID <= 10
orderby t.DUEDATE, t.PROJECTID
select t);
It does result in a nested SELECT
rather than querying TPM_USERTASK
directly, but it seems fairly efficient none-the-less.