2

I want to return a relatively large number of records from SQL Express 2008 R2 server, via EntityFramework 4 through WCF service to a WCF client. My test table contains around 11.000 records at the moment. The LINQ query is as simple as this:

Database DB = new Database(); // create object context
var retValue = DB.Entities.Persons
        .Include("District")
        .Include("District.City")
        .Include("District.City.State")
        .Include("Nationality")

return retValue.ToList();

This takes about 10 seconds to complete.

The same SELECT query takes less than 1 second when executed in SQL Server Managament Studio.

Does it have to be that slow in EF ?

Daniel A. White
  • 187,200
  • 47
  • 362
  • 445
Dejan Janjušević
  • 3,181
  • 4
  • 41
  • 67
  • What else has to happen when this code runs, apart from the query being executed by SQL Server? – AakashM Apr 04 '12 at 10:54
  • @AakashM how do you mean? It's just this one statement which happens at that moment. There is WCF tracing enabled at the moment but I doubt it could add those 9 seconds. – Dejan Janjušević Apr 04 '12 at 10:56
  • Right, I'm asking you to think about what *actually happens* when this code runs, *apart from* the query being executed by SQL Server. – AakashM Apr 04 '12 at 10:58
  • @AakashM just tried without WCF tracing, there is no difference in timing. – Dejan Janjušević Apr 04 '12 at 11:00
  • Did you use SQL Profiler to see how many SQL queries are actually running? I bet you'll see many queries fired at your database. – Steven Apr 04 '12 at 11:05
  • @Steven I used AnjLab.SqlProfiler for SQL Express to see that there are some additional queries related to role and membership provider, however the time elapsed for them was insignificant. The query that retrieves data from DB took ~10 seconds. When I copy the same query, paste it and execute it from within SSMS, it takes less then 1 sec. – Dejan Janjušević Apr 04 '12 at 11:16
  • Please post the exact SQL query. Maybe EF is screwing things up, would not suprise me. When you post the query I'll take a look. – usr Apr 04 '12 at 11:16
  • What happens when you run that same query again within the same transaction or open connection? Does it still run in 9 seconds? Perhaps there is something wrong with your Express configuration. In that case it can take a long time for ADO.NET to connect to Express. – Steven Apr 04 '12 at 11:18
  • you realize this is a SELECT * with 4 left outer joins? Did you run that EXACT same query in SSMS and it takes < 1 second? – RPM1984 Apr 04 '12 at 11:27
  • @Steven, when I run the query for the second time using the same context, the delay is gone. However I have designed all queries so they create a new context on each invoke. Is there any way I can speed it up? – Dejan Janjušević Apr 04 '12 at 11:29
  • @DejanCG: I bet that there is a problem with the configuration of your local SQL Express instance. I've seen this question before at stackoverflow. Try changing the Protocols for SQL Server Express in the Sql Server Configuration Manager. – Steven Apr 04 '12 at 11:49
  • @DejanCG: Can you try to call the query twice, but NOT within the same context. Instead do it in a new context but within the same WCF operation. (Basically just assign `retValue.ToList()` to a local variable and ignore it, then dispose the old context, open a new one and run the query again.) How is the performance of that second query then? – Slauma Apr 04 '12 at 12:40
  • @Steven I tried switching to named pipe, no difference... – Dejan Janjušević Apr 04 '12 at 12:53
  • @Slauma, I tried that and here are the results: 1st context created in 00:00:00.0002040; 1st query in 00:00:10.6595183; 2nd context created in 00:00:00.0000052; 2nd query in 00:00:08.8155699; 2nd query with 1st context in 00:00:00.7618429 – Dejan Janjušević Apr 04 '12 at 13:05

2 Answers2

5

Your query is not simple, it contains a lot of joins (due to the Includes) and more importantly it might return a lot of duplicated data, especially if the included navigation properties are collections: https://stackoverflow.com/a/5522195/270591

The time comsuming part is object materialization and attaching the entities to the context when the result from the database is returned to the Entity Framework context.

This is confirmed by your measurements (in the comments to your question) that a second query within the same context is very fast. In this case EF will perform a query to the database but doesn't need to materialize the objects again because they are still attached to the context.

If you run the second query in a second context the resulting entities must the attached to the new context - and this step is again slow (also confirmed by your measurements).

This is probably a point where a query with EF is in fact slow and adds a lot of overhead compared to a raw SQL query. EF needs to create many data structures prepared for change tracking and managing object identities in the context which consumes additional time.

The only way I can see to improve the performance is disabling change tracking (supposed, you don't need it for your operations). In EF 4.0 / ObjectContext it would be:

Database DB = new Database();
DB.Entities.Persons.MergeOption = MergeOption.NoTracking;
// MergeOption is in System.Data.Objects namespace

When using this approach, one has to be aware though that related objects will be created as separate objects even when they have the same key - which is not the case with enabled change tracking because attaching to the context will avoid this duplication.

So, potentially more objects will be loaded into memory. If this is counterproductive and degrades actually the performance even more or if it still performs better is a matter of a test.

Community
  • 1
  • 1
Slauma
  • 175,098
  • 59
  • 401
  • 420
  • It is not that complicated either because all `Include`s reference navigation properties with many-to-1 relationship. So there would be no duplicate data in rows, only NULL values in columns. While your answer looked like it would nail it, unfortunately after applying NoTracking, the query lasts for more than 1 minute.(!) The service times out because the timeout is set to 00:01:00 – Dejan Janjušević Apr 04 '12 at 13:45
  • @DejanCG: I almost thought (because of he names) that the navigation properties are only single references. I was afraid that NoTracking could be even slower, because, I guess, Nationality, City etc. will be very often the same, so you have a lot of dupliated objects. For the moment I have no idea anymore. – Slauma Apr 04 '12 at 14:03
  • @DejanCG: Just to make sure it works as I expect: Can you call the following after the query (with MergeOption `NoTracking`) is excuted (after `.ToList`): `DB.Entities.ObjectStateManager.GetObjectStateEntries(EntityState.Unchanged).Count()` and check if the result is `0`? Another question: Are you using POCOs or `EntityObject` derived entities? – Slauma Apr 04 '12 at 14:23
  • Thanks for helping me with this. The count returned is `0`. Originally I used `EntityObject`s which had to be converted to my own POCOs before sending back to the client, however when I encountered a large number of records, it has proven to be an unefficient solution. I had to call `ToList` from `IQueryable`, then convert one by one to my POCO using a `foreach` loop, and finally send it all to the client, but serialized. I didn't know which segment was slower. Now I use POCOs generated by EF 4.x POCO Generator. – Dejan Janjušević Apr 04 '12 at 15:21
  • by the way, you were right and I am marking the answer as accepted. After we added `NoTracking` option I didn't measure on WCF side, but on client side. `NoTracking` as expected creates a lot more objects so the serialization must be a LOT slower. The query itself completed in 2 seconds. Now I am opening a new question about that damn serialization which drives me insane. Thanks! – Dejan Janjušević Apr 04 '12 at 15:24
  • It's probably better to change your service contract to provide a skip and a count parameter so that you can do paging through the service. Just sending 10.000 complex object structures over one webservice call is very inefficient. The client seldom needs all 10.000 objects to work. – jessehouwing Apr 04 '12 at 18:04
  • @jessehouwing Thanks for your input. I thought about paging, however I want client to receive the entire result set which would populate the DataGridView and which would allow the client to jump to any record by typing the first letters of some key in a text box. – Dejan Janjušević Apr 04 '12 at 19:13
  • You could accomplish the same features using OData or by fetching the data starting with that letter on request. Or by linking the dataset directly to the data instead of through an object projection. Entity Framework isn't the best technology for querying flat data (as you might have noticed). – jessehouwing Apr 04 '12 at 19:19
  • @jessehouwing I can't use WCF Data Services because I have a specific security demand. I implemented some custom binding with message-based security and role-based operation authorization which is as far as I know not possible under WCF Data Services. So I am stuck with EF and therefore have to take the best out of it. – Dejan Janjušević Apr 04 '12 at 19:32
  • Create a ViewModel specific for your view. See my answer to your other post as well. – jessehouwing Apr 04 '12 at 19:33
0

This is very likely because query compilation (LINQ-query w lots of includes -> SQL to use) is very slow in EF compared to query execution. You can verify if this is the problem by CPU profiling your code. Consider using fewer includes + multiple smaller queries, using compiled query/ies or upgrade to the latest EF5 beta.

Mahol25
  • 3,131
  • 2
  • 22
  • 20
  • Thanks, the solution I have chosen is to create custom ViewModels instead of returning the whole result set. It works much faster, the entire process (query, ToList, serialization, deserialization, converting to DataTable) takes less than 1 second with these 11.000 records. About your EF5 suggestion, I tend to not use anything which is "beta" and besides, the only performance improvement is caching compiled queries. – Dejan Janjušević Apr 05 '12 at 09:35