I am populating my database from online data using a loop like this (simplified, with no error checking):

foreach (var catalog in catalogs)
{
   // Items returns the next chunk for this catalog, or null when nothing is left.
   var result = Items(catalog, state, context);
   while (result != null)
   {
      result.ForEach(r => context.DbContext.Items.Add(r));
      context.DbContext.SaveChanges(); // persist each chunk right away
      result = Items(catalog, state, context);
   }
}

The code takes some time to get the XML response from the server and to decode it into an XElement, using XElement.Load on the response stream. The XElement is then decoded into a list of at most 50 items, which is what I request from the server on each loop pass. That chunk is saved to a table right away because of the SaveChanges call.

About 8/10 of the loop time is spent adding items to the DbContext, in the SaveChanges call, or in both. Communicating with the remote server and decoding the response XML into a list of entities accounts for the remaining 2/10.

How can I increase the efficiency of storing data into the database, while still staying with EF?

I am aware that I can bulk-load the database from XML, but that would force me to work out the SQL statements myself, because several related tables get updated by the SaveChanges call above, and so I would start losing the advantages of using EF.

Tony

1 Answer

In short: you cannot speed up your insertion process with pure EF, because EF has very poor performance for bulk / batch data processing. You have two problems:

  • Adding an entity to the context has a cost, and this cost increases with every entity already present in the context. To avoid this, you can try calling SaveChanges after each call to Add, or even use a new context for every batch or for every call to Add.
  • EF makes a separate database roundtrip for every record you want to insert, update or delete, so it generally doesn't matter how often you call SaveChanges. Avoiding this is mostly possible only by using direct SQL and creating a single SqlCommand that executes all inserts at once.
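
The first point (a fresh context per batch) could look roughly like the sketch below. It assumes the EF 4.1+ DbContext API; `Item`, `CatalogDbContext` and `BatchSaver` are hypothetical names standing in for the question's model, not the asker's actual classes.

```csharp
using System.Collections.Generic;
using System.Data.Entity; // Entity Framework 4.1+ (DbContext API)

// Hypothetical entity and context, standing in for the question's model.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class CatalogDbContext : DbContext
{
    public DbSet<Item> Items { get; set; }
}

public static class BatchSaver
{
    // Saves one downloaded chunk (at most 50 items in the question)
    // through a short-lived context, so tracked entities never pile up.
    public static void SaveBatch(IEnumerable<Item> batch)
    {
        using (var db = new CatalogDbContext())
        {
            foreach (var item in batch)
            {
                db.Items.Add(item);
            }
            db.SaveChanges();
        } // the context and everything it tracked are discarded here
    }
}
```

This only keeps the cost of Add from growing over time; it does not reduce the number of roundtrips. It also assumes the entities in a batch are not attached to some other context, which may not hold if the decoding step looks up related entities through a shared context.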

If you want to increase performance, use direct SQL.
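
As a minimal sketch of what "a single command executing all inserts at once" might look like: the helper below builds one parameterized command for a whole batch. It is written against the provider-neutral `System.Data.Common` types; with SQL Server the connection would be a `SqlConnection` and the result a `SqlCommand`. The table `Items`, the column `Name`, and the `BatchInsert` helper are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Common;
using System.Text;

public static class BatchInsert
{
    // Builds ONE command that inserts every name in a single roundtrip.
    // "Items" and "Name" are hypothetical table and column names.
    public static DbCommand BuildInsertCommand(DbConnection connection, IList<string> names)
    {
        if (names.Count == 0)
        {
            throw new ArgumentException("The batch must contain at least one item.", "names");
        }

        DbCommand command = connection.CreateCommand();
        var sql = new StringBuilder();

        for (int i = 0; i < names.Count; i++)
        {
            // One INSERT statement and one parameter per row.
            string parameterName = "@name" + i;
            sql.AppendLine("INSERT INTO Items (Name) VALUES (" + parameterName + ");");

            DbParameter parameter = command.CreateParameter();
            parameter.ParameterName = parameterName;
            parameter.Value = names[i];
            command.Parameters.Add(parameter);
        }

        command.CommandText = sql.ToString();
        return command;
    }
}
```

The caller would open the connection, call `BuildInsertCommand`, and run `ExecuteNonQuery()` once per batch, ideally inside a transaction. The sketch covers a single table only; the related tables mentioned in the question would need their own statements and key handling, which is exactly the work EF was doing for you.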

Ladislav Mrnka
  • Sigh. Initially I expected that constructing XElement from a stream and then creating a list of POCO objects from its XElement children would be the bottleneck! – Tony Mar 07 '12 at 15:41