I've got a fairly generic CRUD webapp, which generates pages dynamically according to the contents of several database tables. I'm using Entity Framework 4.0 to pull this data out of the DB, however I'm running into severe performance problems. I've managed to iterate down into a problem which is contained enough that I can detail below.
I have a table containing list of Page Forms (~200). Each form has one or more Fields (~4000 total), and each field has may have some Parameters (~16000 total).
I've attached a screenshot of my model below:
The associated entity objects are as follows:
public class Form
{
public int FormID { get; set; }
public string FormName { get; set; }
public IList<FormField> FormFields { get; set; }
}
public class FormField
{
public int FieldID { get; set; }
public string FieldName { get; set; }
public int FormID{ get; set; }
public IList<FormFieldParameter> FormFieldParameters { get; set; }
public Form ParentForm { get; set; }
}
public class FormFieldParameter
{
public int FieldParamID{ get; set; }
public string Value{ get; set; }
public int? FieldID { get; set; }
public FormField ParentField { get; set; }
}
The following code pulls out all data for the Form which has an ID of '1'.
EntityConnection myConnection = new EntityConnection("name=myModel");
if(conn.State != ConnectionState.Open) {
conn.Open();
}
ObjectContext context = new ObjectContext("name=myModel");
context.ContextOptions.LazyLoadingEnabled = false;
ObjectQuery<PageForm> myObjectSet = context.CreateObjectSet<PageForm>()
.Include("FormField.FormFieldParameter");
//Edit: I missed this part out, sorry. In hindsight, this was exactly what was
//causing the issue.
IEnumerable<PageForm> myObjectSetEnumerable = myObjectSet.AsEnumerable();
IQueryable<PageForm> myFilteredObjectSet = myObjectSetEnumerable.Where(c => c.FormID == 1)
.AsQueryable();
List<PageForm> myReturnValue = myFilteredObjectSet.toList();
Now, while this does work, it runs really poorly. The query takes over a second to run, the entirety of which is spent in the myFilteredObjectSet.toList()
call. I ran a profiler on my database to see what was causing the delay, and found that the following query was being generated:
SELECT
[Project1].[FormID] AS [FormID],
[Project1].[FormName] AS [FormName],
[Project1].[C2] AS [C1],
[Project1].[FormID1] AS [FormID1],
[Project1].[FieldID] AS [FieldID],
[Project1].[FieldName] AS [FieldName],
[Project1].[C1] AS [C2],
[Project1].[FieldParamID] AS [FieldParamID],
[Project1].[Value] AS [Value],
[Project1].[FieldID1] AS [FieldID1]
FROM ( SELECT
[Extent1].[FormID] AS [FormID],
[Extent1].[FormName] AS [FormName],
[Join1].[FieldID] AS [FieldID],
[Join1].[FieldName] AS [FieldName],
[Join1].[FormID] AS [FormID1],
[Join1].[FieldParamID] AS [FieldParamID],
[Join1].[Value] AS [Value],
[Join1].[FieldID1] AS [FieldID1],
CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) WHEN ([Join1].[FieldParamID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1],
CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
FROM [dbo].[PageForm] AS [Extent1]
LEFT OUTER JOIN (SELECT [Extent2].[FieldID] AS [FieldID], [Extent2].[FieldName] AS [FieldName], [Extent2].[FormID] AS [FormID], [Extent3].[FieldParamID] AS [FieldParamID], [Extent3].[Value] AS [Value], [Extent3].[FieldID] AS [FieldID1]
FROM [dbo].[FormField] AS [Extent2]
LEFT OUTER JOIN [dbo].[FormFieldParameter] AS [Extent3] ON [Extent2].[FieldID] = [Extent3].[FieldID] ) AS [Join1] ON [Extent1].[FormID] = [Join1].[FormID]
) AS [Project1]
ORDER BY [Project1].[FormID] ASC, [Project1].[C2] ASC, [Project1].[FieldID] ASC, [Project1].[C1] ASC
The duration of this query shown on the sql profiler shows that this query is what is taking so long to run. The interesting thing about the query, is that there is no filtering on it at all - It is returning the entire tree! I can't understand why it is returning everything, as the filter myObjectSet.Where(c => c.FormID == 1)
is pretty explicit. The actual returned object only contains a single entry, which I would expect.
I'm having this problem across my entire data access layer, and its performance is appalling. I have no idea why the generated query doesn't contain the filter - and no idea how to tell it to do so. Does anybody know the answer?