44

Recently I have been doing some simple EF work. Very simple: first,

List<Book> books = entity.Books.Where(c => c.processed == false).ToList();
then

    foreach (var book in books)
    {
        // Do some logic, update some properties
        book.ISBN = "ISBN " + randomNumber();
        book.processed = true;
        entity.SaveChanges();
    }

I put entity.SaveChanges() within the foreach because it is a large list, around 100k records. If a record is processed without problems, I flag it by setting book.processed = true; that way, if the process is interrupted by an exception, I don't have to process the good records again next time.

Everything seemed OK to me, and it is fast when you process hundreds of records. But when we moved to 100k records, entity.SaveChanges() became very, very slow: around 1-3 seconds per record. We then kept the entity model but replaced entity.SaveChanges() with a classic SqlHelper.ExecuteNonQuery("update_book", sqlParams), and it was very fast.

Could anyone tell me why Entity Framework is that slow here? And if I still want to use entity.SaveChanges(), what is the best way to improve the performance?

Thank you

thunderblaster
user454232

7 Answers

43

Turn off change tracking before you perform your inserts. This will improve your performance significantly (orders of magnitude). Putting SaveChanges() outside your loop will help as well, but turning off change tracking will help even more.

using (var context = new CustomerContext())
{
    context.Configuration.AutoDetectChangesEnabled = false;

    // A loop to add all your new entities

    context.SaveChanges();
}

See this page for some more information.

Steve
  • What would be the negative effects of turning this off? – RayLoveless Aug 20 '15 at 15:37
  • @RayLoveless `SaveChanges` will commit every field in every inserted entity to the database. The query will be longer but arguably still faster than running `DetectChanges`, especially for bulk inserts. – Gusdor Feb 23 '16 at 11:33
  • @RayLoveless [Microsoft says](https://msdn.microsoft.com/en-us/data/jj556205.aspx) this "can easily introduce subtle bugs into your application". When I tried it, it took me a while to figure out why my updates stopped working altogether. When I fixed that, it wasn't fast anymore. I suppose it's possibly useful for other scenarios. – Doppelganger Mar 10 '16 at 18:31
  • No change! It's still taking time. I am saving 1,200 records at a time and it is taking 24 seconds to save. – Krunal Shah Aug 08 '18 at 07:28
  • Here is a very clear explanation from Rick Strahl of a few approaches to solve this: https://weblog.west-wind.com/posts/2014/dec/21/gotcha-entity-framework-gets-slow-in-long-iteration-loops#Turn-off-Change-Tracking **TLDR**: If writing changes to the DB, recreate the dbContext on each iteration; otherwise turn off change tracking (see the sketch after these comments). – Victor SDK Jul 15 '21 at 20:06
  • In my case I used `context.ChangeTracker.AutoDetectChangesEnabled = false;` – tisaconundrum Aug 02 '22 at 21:52
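
To illustrate the pattern from the comments above: a minimal sketch that commits in batches and recreates the context periodically, so the change tracker never accumulates 100k entities. BookContext, the batch size, and the id-based paging are assumptions for illustration, not part of the original answer.

const int batchSize = 1000; // illustrative; tune to taste

// Read the keys up front so each batch can start from a fresh, small context.
List<int> bookIds;
using (var ctx = new BookContext())
{
    bookIds = ctx.Books.Where(b => !b.processed).Select(b => b.Id).ToList();
}

for (int i = 0; i < bookIds.Count; i += batchSize)
{
    var batch = bookIds.Skip(i).Take(batchSize).ToList();

    // A fresh context per batch keeps the tracker small, so DetectChanges
    // scans at most batchSize entities instead of everything seen so far.
    using (var ctx = new BookContext())
    {
        foreach (var book in ctx.Books.Where(b => batch.Contains(b.Id)))
        {
            book.ISBN = "ISBN " + randomNumber();
            book.processed = true;
        }
        ctx.SaveChanges(); // one commit per batch keeps the crash-recovery behaviour
    }
}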
8

I would take the SaveChanges() call outside of the foreach. Since the books are on the context as a list, you can move the call outside the loop and EF will work better with the resulting code.

The list is tracked by the context, and EF is designed to optimize updates/creates/deletes against the back-end database. If you do this, I'd be curious whether it helps.

Arsman Ahmad
  • The reason I put SaveChanges in the foreach is that the records are relatively big and the run time is costly (it usually takes 4-5 hours), so we decided that if updating a record has no problem, we flag it as "processed". That way, if it crashes or anything happens, we don't have to re-update the "good" records again. – user454232 Jan 22 '14 at 16:12
  • That's good thinking, but Entity Framework might handle it better the other way. Entity Framework has the ability to optimize updates and inserts. It also tends to attempt a full commit on the entire record when SaveChanges() is placed inside a loop such as the one you posted. – Kyland Holmes Jan 24 '14 at 05:57
8

I would also advise you to take SaveChanges() out of the loop, as calling it inside performs 'n' separate updates against the database, so the context has to run through its checkpoints and validations 'n' times.

var books = entity.Books.Where(c => c.processed == false).ToList();

books.ForEach(b =>
{
    // Do some logic, update some properties
    b.ISBN = "ISBN " + randomNumber();
    b.processed = true;
});
entity.SaveChanges();
Arsman Ahmad
Dani Mathew
2

The Entity Framework, in my opinion, is a poor choice for BULK operations both from a performance and a memory consumption standpoint. Once you get beyond a few thousand records, the SaveChanges method really starts to break down.

You can try to partition your work into smaller transactions, but again, I think you're working too hard to hand-roll something the DBMS already provides.

A much better approach is to leverage the BULK operations already provided by your DBMS. SQL Server provides bulk copy via SqlBulkCopy in .NET. Oracle provides bulk copy for Oracle.DataAccess (the unmanaged provider); for Oracle.ManagedDataAccess, the bulk copy library unfortunately isn't available, but you can create an Oracle stored procedure using BULK COLLECT/FORALL that inserts thousands of records within seconds, with a much lower memory footprint in your application. In the .NET app you can pass PL/SQL associative arrays as parameters, etc.
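
For SQL Server, a minimal sketch of the bulk-copy route looks like the following; the staging-table name, the column layout, and the connectionString variable are assumptions for illustration:

// Stage the rows in a DataTable whose columns match the target table.
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("ISBN", typeof(string));
table.Columns.Add("processed", typeof(bool));

foreach (var book in books)
{
    table.Rows.Add(book.Id, "ISBN " + randomNumber(), true);
}

// SqlBulkCopy streams all rows to the server in one bulk operation,
// avoiding the per-row round trips that make SaveChanges slow here.
using (var bulkCopy = new SqlBulkCopy(connectionString))
{
    bulkCopy.DestinationTableName = "dbo.Books_Staging";
    bulkCopy.WriteToServer(table);
}

For the update scenario in the question, you would bulk-copy into a staging table like this and then issue a single UPDATE ... JOIN (or MERGE) from the staging table into the real one.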

The benefit of leveraging the BULK facilities within your DBMS is reducing the context switches between your application, the query processor and the database engine.

I'm sure other database vendors provide something similar.

Charles Owen
1

"AsNoTracking" works for me

ex:

Item itemctx = ctx.Items.AsNoTracking().Single(i => i.idItem == item.idItem);
ctx.Entry(itemctx).CurrentValues.SetValues(item);
itemctx.images = item.images;
ctx.SaveChanges();

Without "AsNoTracking" the updates is very slow.

Paulo
1

I just execute the insert command directly.

// The Id property is an identity primary key; IDENTITY_INSERT lets us
// insert explicit values so existing ids are preserved.
db.Database.ExecuteSqlCommand("SET IDENTITY_INSERT [dbo].[MyTable] ON");

foreach (var p in itemsToSave)
{
    db.Database.ExecuteSqlCommand(
        "INSERT INTO [dbo].[MyTable] ([Id], [Property1], [Property2]) VALUES (@Id, @Property1, @Property2)",
        new SqlParameter("@Id", p.Id),
        new SqlParameter("@Property1", p.Property1),
        new SqlParameter("@Property2", p.Property2));
}

db.Database.ExecuteSqlCommand("SET IDENTITY_INSERT [dbo].[MyTable] OFF");

Works really quickly. EF is just impossibly slow with bulk updates, completely unusable in my case for more than a dozen or so items.

ardmark
  • Please read the question: they already know that and they're not asking for an alternative outside Entity Framework. Also, it's not about inserts here, but individual update statements. – Gert Arnold Jan 18 '21 at 13:19
  • The original question does not seem to exclude this answer as a possible solution to improving performance, and running an EF query against the table returns all the new items, so EF will track the queries as usual from that point on. This does seem to work very well with large data sets, so I hope it proves helpful. – ardmark Jan 18 '21 at 19:27
0

Use this NuGet package: Z.EntityFramework.Extensions

It has extension methods that you can call on the DbContext, like DbContext.BulkSaveChanges, which works amazingly fast.

Note: this is NOT a free package, but it has a trial period.
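
A minimal usage sketch, assuming the package is installed and a hypothetical BookContext modelled on the question:

using (var context = new BookContext())
{
    foreach (var book in context.Books.Where(b => !b.processed))
    {
        book.ISBN = "ISBN " + randomNumber();
        book.processed = true;
    }

    // Drop-in replacement for SaveChanges(): the library batches the
    // generated UPDATE statements instead of issuing them one by one.
    context.BulkSaveChanges();
}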