
I have an ASP.NET MVC2 site connecting to a MySQL database via DbLinq. There is a particular set of actions that get done at regular intervals on the site, which includes looping over a particular set of records in a few tables and updating them, and adding some new records in some other tables.

I've been testing with a moderate sized set of data. In my particular test set right now, on an update it ends up inserting 44 new rows and updating 81 other rows. But my call to SubmitChanges() ends up taking a very long time - ~3-4 minutes, which seems like a long time to push (what I thought was) a relatively small number of changes to the DB.
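
Roughly, the update pass is structured like this (the context, entity, and column names here are placeholders, not my actual schema):

```csharp
// Simplified sketch of the scheduled update pass: one DataContext for the
// whole logical transaction, with a single SubmitChanges() at the end.
using (var db = new MyDataContext(connectionString))
{
    // Loop over an existing set of records and modify them in place (~80 rows).
    foreach (var item in db.Items.Where(i => i.NeedsProcessing))
    {
        item.Status = "Processed";
        item.LastProcessed = DateTime.UtcNow;

        // Add a new record to a related table for each processed item (~40 rows).
        db.ItemHistories.InsertOnSubmit(new ItemHistory
        {
            ItemId = item.Id,
            ProcessedAt = DateTime.UtcNow
        });
    }

    // This is the call that takes 3-4 minutes.
    db.SubmitChanges();
}
```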

I eventually did some simple profiling, and I found that the problem does not seem to be with executing the query on the database, or even building the query. The majority of the time seems to be taken up by a call inside of UpdateEntity to AllTrackedEntities.ContainsReference().

To give some actual numbers, from a recent test run I had:

  • Time in SubmitChangesImpl: 204884 ms
    • Time in UpdateEntity: 200908 ms
      • Time in ContainsReference: 148173 ms
      • Time in QueryBuilder.GetUpdateQuery: 685 ms
      • Time in QueryRunner.Update: 28 ms
      • Time in UpdateReferencedObjects: 49958 ms

As you can see, building and running the SQL query is dwarfed by the time spent checking whether a reference to the entity being updated already exists (if there is no reference, the entity is inserted instead; in this case all of the updated entities already exist). I understand why this check happens, to maintain data integrity and so on, but it is killing the performance of these regular update operations.

I looked at setting ObjectTrackingEnabled to false, but that makes the DataContext read-only, which is no use to me; my issue is specifically with the performance of updates.
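
For reference, this is roughly what that attempt looked like (again, names are placeholders); with tracking disabled the context can still query, but SubmitChanges() has nothing to work with:

```csharp
using (var db = new MyDataContext(connectionString))
{
    // Must be set before the first query is executed against this context.
    db.ObjectTrackingEnabled = false;

    // Reads still work...
    var items = db.Items.Where(i => i.NeedsProcessing).ToList();

    // ...but no changes are tracked, so the context is effectively read-only
    // and cannot be used for the updates and inserts I need to do.
}
```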

Is there anything that can be done to improve the performance of updates? Am I using DbLinq in a less than optimal way in terms of trying to push through 40-50 inserts and 80+ updates in a single submit? If so, is there a better way to go about this?

Blobinator
  • Your logic must be getting in the way, show some code - or at least give a concise example which reproduces the issue you are having. I construct complex input forms which end up sending about 18 records for addition into a mysql database using EF 4.1 and it takes roughly 5 - 10 seconds if that. – Travis J Jun 19 '12 at 19:31
  • Do you use the same dataContext over a long period of time, i.e. read some rows, update/add some rows, submit changes, then repeat with the same dataContext? – hatchet - done with SOverflow Jun 19 '12 at 19:34
  • @hatchet - I know this was directed at the OP, but in my repository I have it save changes every time a change is made, and use the same dataContext the whole time. If he is opening and closing a connection a bunch of times that could be contributing to the time. However, it seems to me that there is some sort of `n!` operation occurring. – Travis J Jun 19 '12 at 19:38
  • My understanding of linq to Sql is that once materialized, an entity remains in the context's internal cache. For long-lived contexts, this may accumulate much data that may be stale, and cause more work for its internal change tracking. DataContext is intended to be short-lived...use it and toss it. They are cheap to construct. – hatchet - done with SOverflow Jun 19 '12 at 19:52
  • @hatchet is absolutely correct, my problem turned out to be too long a lifetime on my DataContext. I had thought that, since all of the operations made up a single logical transaction, I was better off with one DataContext, and a single SubmitChanges at the end. I put many SubmitChanges in to test, and then it became clear - each subsequent SubmitChanges was taking longer than the previous, even if it was doing less "work" than the one before. Breaking it up into separate, short-lived DataContexts did the trick. If you want to add that as an answer, I'll accept it as soon as I can. – Blobinator Jun 19 '12 at 20:11
  • @Blobinator - added the answer. Glad that helped. – hatchet - done with SOverflow Jun 19 '12 at 20:24

1 Answer


If you use long-lived DataContexts, where you read data, modify data, submit changes, then repeat using the same DataContext object, this can negatively impact performance. Once a DataContext has materialized an entity, it keeps it internally for the life of the DataContext as part of its object tracking. Over time, this internal cache can become large, in some cases effectively becoming an in-memory cache of a large portion of your database. This can slow things down, and cause more work for the DataContext during SubmitChanges.

DataContext is meant to be short-lived. Its lifespan should be a unit of work: create it, use it for something, then dispose of it.
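
As a rough sketch (the entity names and the Batch() helper are illustrative, not a specific API), instead of one context with a single SubmitChanges at the end, give each small batch of work its own context:

```csharp
// One short-lived DataContext per batch of ids to process.
foreach (var batch in idsToProcess.Batch(25))   // Batch() = hypothetical chunking helper
{
    using (var db = new MyDataContext(connectionString))
    {
        foreach (var item in db.Items.Where(i => batch.Contains(i.Id)))
        {
            item.Status = "Processed";
            db.ItemHistories.InsertOnSubmit(new ItemHistory { ItemId = item.Id });
        }

        // The context has only ever materialized this batch's rows, so the
        // change-tracking work done by SubmitChanges stays small.
        db.SubmitChanges();
    }   // the context and its tracked-entity cache are discarded here
}
```

Each SubmitChanges then only has to diff the handful of entities that particular context has seen, rather than everything materialized since the context was created.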

Here is some more detail:

Why would reusing a DataContext have a negative performance impact?

and this by someone close to the product:

http://blogs.msdn.com/b/dinesh.kulkarni/archive/2008/04/27/lifetime-of-a-linq-to-sql-datacontext.aspx

Long-lived usage: DataContext does not itself overwrite the objects once you retrieve them through queries. So as time passes, the retrieved objects can become stale if they are frequently changed.

Life after SubmitChanges(): DataContext could be used after SubmitChanges() but one has to be careful. SubmitChanges() does all the hard work of figuring out all the changes you have made to the object graph. It orders the CUD operations for you and provides optimistic concurrency check at the granularity of each changed object.
