
I'm running a batch job that inserts events into a database with Entity Framework. The total size varies, but around 3.5 million events is typical. This list is split into groups of 60-500k events that are then added to the database.

However, when around one million events have been added I get a System.OutOfMemoryException and have to start over, so completing the job currently takes 3-4 runs. How can I clear my db object after SaveChanges() has been called so this won't happen?

public DbSet<Events> Events { get; set; }
...
var groupedList = list.GroupBy(x => x.Import.FileName);
var db = new ApplicationDbContext();

foreach (var group in groupedList)
{
   db.Events.AddRange(group);
   db.SaveChanges();
}

Stack trace:

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at System.Data.Entity.Core.Mapping.Update.Internal.KeyManager.GetPrincipals(Int32 identifier)
at System.Data.Entity.Core.Mapping.Update.Internal.DynamicUpdateCommand..ctor(TableChangeProcessor processor, UpdateTranslator translator, ModificationOperator modificationOperator, PropagatorResult originalValues, PropagatorResult currentValues, DbModificationCommandTree tree, Dictionary`2 outputIdentifiers)
at System.Data.Entity.Core.Mapping.Update.Internal.UpdateCompiler.BuildInsertCommand(PropagatorResult newRow, TableChangeProcessor processor)
at System.Data.Entity.Core.Mapping.Update.Internal.TableChangeProcessor.CompileCommands(ChangeNode changeNode, UpdateCompiler compiler)
at System.Data.Entity.Core.Mapping.Update.Internal.UpdateTranslator.<ProduceDynamicCommands>d__a.MoveNext()
at System.Linq.Enumerable.<ConcatIterator>d__58`1.MoveNext()
at System.Data.Entity.Core.Mapping.Update.Internal.UpdateCommandOrderer..ctor(IEnumerable`1 commands, UpdateTranslator translator)
at System.Data.Entity.Core.Mapping.Update.Internal.UpdateTranslator.ProduceCommands()
at System.Data.Entity.Core.Mapping.Update.Internal.UpdateTranslator.Update()
at System.Data.Entity.Core.EntityClient.Internal.EntityAdapter.<Update>b__2(UpdateTranslator ut)
at System.Data.Entity.Core.EntityClient.Internal.EntityAdapter.Update[T](T noChangesResult, Func`2 updateFunction)
at System.Data.Entity.Core.EntityClient.Internal.EntityAdapter.Update()
at System.Data.Entity.Core.Objects.ObjectContext.<SaveChangesToStore>b__35()
at System.Data.Entity.Core.Objects.ObjectContext.ExecuteInTransaction[T](Func`1 func, IDbExecutionStrategy executionStrategy, Boolean startLocalTransaction, Boolean releaseConnectionOnSuccess)
at System.Data.Entity.Core.Objects.ObjectContext.SaveChangesToStore(SaveOptions options, IDbExecutionStrategy executionStrategy, Boolean startLocalTransaction)
at System.Data.Entity.Core.Objects.ObjectContext.<>c__DisplayClass2a.<SaveChangesInternal>b__27()
at System.Data.Entity.SqlServer.DefaultSqlExecutionStrategy.Execute[TResult](Func`1 operation)
at System.Data.Entity.Core.Objects.ObjectContext.SaveChangesInternal(SaveOptions options, Boolean executeInExistingTransaction)
at System.Data.Entity.Core.Objects.ObjectContext.SaveChanges(SaveOptions options)
at System.Data.Entity.Internal.InternalContext.SaveChanges()
at System.Data.Entity.Internal.LazyInternalContext.SaveChanges()
at System.Data.Entity.DbContext.SaveChanges()
at Project.Program.Main(String[] args) in C:\Users\oscar\Documents\Solution\Project\Program.cs:line 135
Ogglas
  • Did you try re-creating an ApplicationDbContext? That means moving the `new ApplicationDbContext()` inside the foreach loop. – Gimly Apr 27 '17 at 09:50
  • Dispose of it and new up a new DbContext. – spender Apr 27 '17 at 09:50
  • Possible duplicate of [Entity Framework 4 Out of Memory on SaveChanges](http://stackoverflow.com/questions/15941471/entity-framework-4-out-of-memory-on-savechanges) – Gimly Apr 27 '17 at 09:52

2 Answers


After inserting one batch you should recreate your DbContext to clear the ChangeTracker: every entity you add stays tracked in memory until the context is disposed. This should not only fix your memory issue but also make things a bit faster. Something like:

foreach (var group in groupedList)
{
    using (var db = new ApplicationDbContext())
    {
        db.Events.AddRange(group);
        db.SaveChanges();
    } // disposing the context releases all tracked entities
}
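As a further tweak (not part of the original answer, just a common EF6 bulk-insert pattern), you can also disable automatic change detection inside the loop, since the entities are only added and never modified. A minimal sketch, assuming the question's ApplicationDbContext:

foreach (var group in groupedList)
{
    using (var db = new ApplicationDbContext())
    {
        // Skip the full DetectChanges() scan during SaveChanges();
        // safe here because entities are only ever added, not edited.
        db.Configuration.AutoDetectChangesEnabled = false;

        db.Events.AddRange(group);
        db.SaveChanges();
    }
}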
Mats391

You could also just call:

_context.ChangeTracker.Clear();

This will clear the change tracker and is faster than creating a new context.

Edit: ChangeTracker.Clear() is only available in EF Core 5.0 and later.
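For illustration (a sketch, not from the answer itself), applying this to the question's loop on EF Core 5.0+ would look like:

using var db = new ApplicationDbContext();

foreach (var group in groupedList)
{
    db.Events.AddRange(group);
    db.SaveChanges();

    // Detach everything so the tracker doesn't accumulate entities
    // across batches; the same context and connection are reused.
    db.ChangeTracker.Clear();
}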

Dblock247
  • Faster? Have you got any reference? In the [official docs](https://learn.microsoft.com/en-us/ef/core/change-tracking/) it says "disposing the DbContext is *the normal way* to stop tracking entities". Would they recommend that if it was slower? Creating a new context has always been designed to be a lightweight operation. – Gert Arnold Sep 03 '22 at 18:50
  • I do a lot of data sync between microservices and a legacy database. One of the issues we ran into was that the change tracker grew larger and larger, slowing down the process. The first thing we tried was to chunk the data and create a scoped context for every chunk. That did speed up the process, but not as much as keeping the same context and dumping the change-tracking data with the clear function. Now I can imagine it could just have been my use case, but it seems like creating new objects would use more memory than just clearing the change-tracking data. That shows more as the record counts increase. – Dblock247 Sep 07 '22 at 15:17
  • Full disclosure: I am using .NET Core and dependency injection. With a lot of records being processed, when I throw away the old context I have to wait until the garbage collector disposes of the context class. I suppose you may have ideas that could make my process more efficient, and I am all ears, but that is just what worked best for me. – Dblock247 Sep 07 '22 at 15:29