3

I'm developing software based on Entity Framework to handle data in a MS SQL Server 2008 database.

[Trouble 1]

I've just tried to insert some small data (about 2 MB) from my program into the database: the performance is very bad! It takes more than 1 minute to insert this data!

I've tried generating pre-compiled views, but I get the same results :-(

All my code uses a business layer (automatically generated from the .edmx file with a T4 template) to manage data in a service layer. It is very practical for navigating the relations between objects.

How can I improve the performance of these inserts with Entity Framework?

[Trouble 2]

Also, before inserting data into the database with the SaveChanges() method, I fill my object context with the AddObject() method. Adding about 100,000 small objects (about 2 MB) to my object context with AddObject() takes a very long time (more than 10 minutes)!
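The pattern looks roughly like this (`MyEntities` and `Measurement` are placeholder names standing in for my generated context and entity types):

```csharp
// Roughly what I'm doing now (type and property names are placeholders).
using (var context = new MyEntities())
{
    foreach (var result in results) // ~100,000 small items
    {
        var entity = new Measurement { Id = result.Id, Value = result.Value };
        context.Measurements.AddObject(entity); // this loop alone takes 10+ minutes
    }
    context.SaveChanges(); // and this takes over a minute
}
```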

How can I decrease this time?

UPDATE

My program must save more than 50 MB to the database in less than 2-3 minutes. Do you think this will be possible with EF?

Patrice Pezillier
  • 4,476
  • 9
  • 40
  • 50
  • 3
    Entity Framework is a tradeoff between initial ease of use and performance. If you're interested in performance, move away from Entity Framework – Andomar Oct 08 '10 at 13:29
  • 10
    @Andomar, that's wildly unhelpful. This is not normal performance. Removing the EF might not fix it, and in any event would be like curing acne with decapitation. Patrice needs to profile the app to find the problem, not go blaming tools randomly. – Craig Stuntz Oct 08 '10 at 15:03
  • 1
    @Craig Stuntz: So, in your opinion, suggesting someone profile EF is more helpful? Heheh – Andomar Oct 08 '10 at 15:39
  • 2
    @Andomar: It is absolutely more useful for Patrice to profile the app under development than to rewrite infrastructure at random. Architecture by prejudice is rarely a successful strategy. – Craig Stuntz Oct 08 '10 at 15:46
  • 1
    @Craig Stuntz: I was assuming that it's a new app that doesn't work yet, and has insurmountable performance issues. Exactly what you'd expect from an EF app – Andomar Oct 08 '10 at 15:55
  • @Andomar, that *isn't* what I'd expect from an EF app. But, then again, I'm not the sort of person who blames tools for my problems. – Craig Stuntz Oct 08 '10 at 16:01
  • 2 MB: 20 bytes per object (20 bytes × 100,000 objects = 2,000,000 bytes ≈ 2 MB) – Patrice Pezillier Oct 08 '10 at 16:39
  • @Patrice: So you are inserting roughly 100k objects that are made up of possibly an Int and maybe a very short string of 16 characters or less? – NotMe Oct 08 '10 at 16:50
  • @Patrice: What is the source of this int + decimal? Why is it that you need to insert 100000 of these pairs? Are you pre-filling a lookup table? Capturing sensor data? Please tell us more about your problem. – Craig Stuntz Oct 08 '10 at 17:15
  • These are scientific results; I need to save them for further statistical analysis. In fact, I need to save more than 50 MB in 2-3 minutes. – Patrice Pezillier Oct 08 '10 at 17:18
  • @Patrice - Are you inserting each item into the database separately, or are you adding them all to your data objects and then doing the insert in one SQL statement? If you're doing the first, that could be the cause of your performance. Repeatedly hitting the database will greatly decrease performance. I suggest you profile as @Craig said. – JasCav Oct 08 '10 at 17:29
  • @JasCav: I'm adding them all to my data objects and then doing the insert with SaveChanges() – Patrice Pezillier Oct 08 '10 at 17:44

8 Answers

4

You could use the Entity Framework Profiler to check what SQL is being generated. This tool has a 30-day free trial. It also claims that it can do "Analysis and detection of common pitfalls when using Entity Framework".

There is also this article on EF tuning

Edit

Based on your edits, EF is not the way to go. You need to do a bulk insert in order to get the data in that fast. Have a look at this link, where I helped someone reduce load time from 10 hours to 6 minutes.

Community
  • 1
  • 1
Shiraz Bhaiji
  • 64,065
  • 34
  • 143
  • 252
  • @Shiraz: Can I mix bulk insert with EF ? – Patrice Pezillier Oct 08 '10 at 17:31
  • @Patrice, it does not look like that is possible, see: http://stackoverflow.com/questions/1609153/how-to-do-a-bulk-insert-linq-to-entities – Shiraz Bhaiji Oct 08 '10 at 17:34
  • 1
    @Shiraz, in EF 4 you can execute "native" SQL server statements via [ObjectContext.ExecuteStoreCommand.](http://msdn.microsoft.com/en-us/library/system.data.objects.objectcontext.executestorecommand.aspx)That is probably the best way to do bulk loading. But I still wonder about the source. If it were, e.g., CSV files, a SSIS package might be the best fit. – Craig Stuntz Oct 08 '10 at 17:38
  • @Shiraz,@Craig: I need to do bulk *insert* and loading from memory to database. Can I directly use ObjectContext.ExecuteStoreCommand to bulk loading ? What about bulk insert ? – Patrice Pezillier Oct 08 '10 at 17:42
  • 2
    Patrice, it seems to me that your issues are bigger than one performance point. You need an architectural review. I think that's more than can be done in one SO question. One place to start might be this article http://cacm.acm.org/magazines/2010/6/92486-managing-scientific-data/fulltext Although it deals with somewhat larger data scales, it discusses the issues involved. – Craig Stuntz Oct 08 '10 at 17:49
  • 1
    @Craig - That is my guess, too (architectural issues). I am wondering how the database is structured. I have used EF for large amounts of data and have never had this type of problem. – JasCav Oct 08 '10 at 18:34
2

EF (and LINQ to SQL, for that matter) issues a single update/insert statement per record. This usually results in less than acceptable performance for bulk inserts/updates. The workaround is to create stored procedures that will do these tasks for you. This is an acceptable solution in most cases, when you do these operations on a small number of object types; it is still feasible to use EF, write those SPs manually, and then add them to the EDMX. You can google for using stored procedures in EF.
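Alternatively, in EF 4 you can bypass the per-entity statement generation entirely and call a stored procedure yourself. A minimal sketch, assuming a hypothetical procedure `dbo.InsertMeasurement` and a generated ObjectContext named `MyEntities` (both names are placeholders):

```csharp
// Sketch only: dbo.InsertMeasurement and MyEntities are assumed names.
using System.Data.SqlClient;

using (var context = new MyEntities())
{
    foreach (var m in measurements)
    {
        // ExecuteStoreCommand sends the T-SQL directly to the server,
        // skipping EF's change tracking and statement generation.
        context.ExecuteStoreCommand(
            "EXEC dbo.InsertMeasurement @Id, @Value",
            new SqlParameter("@Id", m.Id),
            new SqlParameter("@Value", m.Value));
    }
}
```

Note this still costs one round trip per row; for truly large volumes a set-based approach (table-valued parameter or bulk copy) will be faster.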

MBoros
  • 1,090
  • 7
  • 19
2

There are several possibilities here.

  1. The database hardware might not be up to the task of handling 100,000 inserts. How many tables are involved? Are there ancillary considerations such as triggers that are firing? Is the database memory constrained?

  2. The Web server hardware might not be up to the task of processing that much load. Where is the data originating? How long is it taking to transfer to the web server? How many inserts/sec is the web server actually sending to the database server?

To sum up, you have to profile to figure out exactly where the bottlenecks are. With the information you've given so far it could be anywhere.

You need to run a profiler on the web server, and you need to use SQL Profiler on the database server. Additionally, you should be monitoring both machines' CPU, memory, and network usage graphs while loading the data.

Once you have all of that data you should be able to pinpoint where the problem is.

UPDATE
As a side note, EF has to create at least 100,000 objects to hold the data that you are uploading (one for each record). This has its own overhead, which is why ORMs are usually not a good idea for large-scale inserting/updating of data.

UPDATE 2
If you are running both the client and the database server on the exact same desktop machine, then you are likely to have hardware issues. Hopefully you have a multi-core system with at least 4 GB of RAM. Everything on your system is going to be competing for resources: Visual Studio, SQL Server, the app itself, plus whatever else you happen to be running. Studio and SQL Server are memory and CPU hogs. (FYI: SQL Server doesn't report everything to Task Manager.)

Point is, unless you are deploying the app with an embedded SQL Server, this isn't going to come close to being a real-world test, and the performance you are seeing (or lack thereof) has no relationship to a real-world scenario.

NotMe
  • 87,343
  • 27
  • 171
  • 245
  • 1. I'm talking about 2 MB of data in only 1 table (20 bytes per row). There are no triggers. How can I see if the database memory is constrained? Can this point explain the very slow AddObject()? 2. SQL Server is on localhost. 3. My program is not a web application. It is a heavy client in C#. – Patrice Pezillier Oct 08 '10 at 16:49
  • @Patrice: pull up task manager and go to the performance tab. When executing the upload, what happens? Is there memory available or is your disk thrashing? Also, if you are running this in debug mode, don't. Run it in release mode. – NotMe Oct 08 '10 at 16:53
  • @Patrice: Then pull out the SQL Profiler and watch the transactions coming across to figure out how it is impacting the rest of the system. – NotMe Oct 08 '10 at 16:59
  • @Chris: OK for the insertion, but how can I reduce the AddObject time? – Patrice Pezillier Oct 08 '10 at 17:02
  • @Patrice: Optimizing EF is a little outside my capabilities. I don't use it because of things like this. My approach to this problem would be to either use a stored procedure that accepts a table-valued parameter containing all of the data I wanted to insert OR to use some type of bulk loading capability. – NotMe Oct 08 '10 at 18:25
1

As others have already pointed out more or less, Entity Framework works by sending one statement for each object and then one additional statement to fetch back the ID value assigned by the database.

This is tremendously slow when running on many objects, especially if the round-trip time to the database server is anything above 1 millisecond.

In order to improve performance when storing many objects to the database, use the SqlBulkCopy class instead of Entity Framework.

You can find full comparison between Entity Framework and SqlBulkCopy in this article: How to Bulk Insert Complex Objects into SQL Server Database
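As a sketch (the table name, column names, and connection string are assumptions; adjust them to your schema), loading 100,000 small rows with SqlBulkCopy might look like this:

```csharp
// Sketch: assumes a table dbo.Measurements(Id int, Value decimal)
// and an already-configured connection string in connStr (placeholders).
using System.Data;
using System.Data.SqlClient;

var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Value", typeof(decimal));

for (int i = 0; i < 100000; i++)
{
    table.Rows.Add(i, (decimal)i / 3);
}

using (var bulk = new SqlBulkCopy(connStr))
{
    bulk.DestinationTableName = "dbo.Measurements";
    bulk.BatchSize = 10000; // commit every 10,000 rows
    bulk.WriteToServer(table);
}
```

Because SqlBulkCopy streams rows to the server in one operation rather than one INSERT per row, this kind of load typically completes in seconds rather than minutes.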

Zoran Horvat
  • 10,924
  • 3
  • 31
  • 43
1

You can profile the queries using SQL Server Profiler and then use the Database Engine Tuning Advisor to determine better statistics and indexes to add.

John Hartsock
  • 85,422
  • 23
  • 131
  • 146
  • @Andomar, you don't know what the problem is. If it's a slow query in an `AFTER INSERT` trigger, adding an index might indeed make the insert faster. "Profile it and find the problem" is the only correct answer here. There's not enough information in Patrice's question to say anything more. – Craig Stuntz Oct 08 '10 at 15:01
1

Chances are that it is not Entity Framework but your database schema that is the culprit, e.g. bad clustered indices or too many indices.

You can see the SQL generated by Entity Framework by using the ObjectQuery.ToTraceString method.

If the SQL generated by Entity Framework isn't satisfactory for your needs you are able to utilize your own stored procedures for create, update and delete operations.

Martin Liversage
  • 104,481
  • 22
  • 209
  • 256
1

In the original question, I get the impression that the 100,000 or so objects are all created at once and a single call to SaveChanges is made. A large unit of work like this, with a single call to SaveChanges, is almost always a very bad idea performance-wise for the Entity Framework. I would try dividing things up into a series of batches of maybe 100 objects or so and calling SaveChanges between batches. In fact, I would even dispose of the context and create a new one between batches, because otherwise you end up with all of those objects in the state manager at once, which imposes overhead. You would probably need to experiment with different batch sizes to find the best number.

Also, you might want to consider what type of entities you are using. If you use POCO entities, there is some overhead in detecting changes which isn't present when you use classes that implement IEntityWithChangeTracking (or inherit from EntityObject, which implements that interface).
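The batching approach described above can be sketched as follows (`MyEntities` and `Measurements` are placeholder names for the generated context and entity set):

```csharp
// Sketch of batched SaveChanges with a fresh context per batch.
// items, MyEntities, and Measurements are placeholder names.
using System.Linq;

const int batchSize = 100; // tune experimentally

for (int start = 0; start < items.Count; start += batchSize)
{
    // A new context per batch keeps the state manager small,
    // so change detection doesn't scan 100,000 tracked objects.
    using (var context = new MyEntities())
    {
        foreach (var item in items.Skip(start).Take(batchSize))
        {
            context.Measurements.AddObject(item);
        }
        context.SaveChanges();
    }
}
```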

  • Danny
0

I am experiencing a similar issue with my current project. I attached to SQL Profiler and kicked off my process. It appears that the Entity Framework kicks off an SQL transaction for every INSERT or UPDATE each time changes are saved.

I'm not sure if that has anything to do with the issue, but I would run SQL Profiler and examine the results. I would also monitor your disk usage (reads and writes) on your SQL box (if possible) and see if there are any red flags.

  • Inserts are always part of a transaction. If Entity Framework didn't start a transaction for each insert, SQL Server would start one automatically. In default mode, the SQL Server started transaction would commit immediately after the insert. That's called implicit transactions. – Andomar Oct 08 '10 at 14:12
  • Ah okay. Thanks for the information. So if there are a large amount of INSERT or UPDATE operations taking place, would this have anything to do with the performance? Do SQL transactions add significant performance overhead? – James Smith Oct 08 '10 at 14:30
  • SQL transactions are required, so technically they don't add overhead, they're just part of the insert. For big imports, check out bulk inserts. – Andomar Oct 08 '10 at 15:33
  • I don't know much of EF but shouldn't you explicitly start transaction before starting large insert or update operation? That will save SQL Server to create a new transaction for every row and that will considerably speed up your operation. – Pradeep Oct 10 '10 at 07:43
  • @Pradeep - That's what I was driving at. If there is a large data set (let's just say >= 1000 records) isn't there a performance hit if EVERY SINGLE row runs its own transaction? – James Smith Oct 11 '10 at 13:40