
I am using Entity Framework 6 to iterate over a large CSV file (2.5 million rows) and write the updates it provides to a database. The problem is that Entity Framework runs out of memory before it can finish creating or updating all the objects.

I have tried:

Implementing a "Detach" function as per How do I clear tracked entities in entity framework but noticed no changes, so I have removed this.

Disposing of the context and re-creating it every 10,000 records:

 if ((numRecords % 10000) == 0)
 {
    context.SaveChanges();

    //Recreate Context;
    context.Dispose();
    context = new MyDataEntities();
    context.Configuration.AutoDetectChangesEnabled = false;
 }

However, when I take a memory snapshot of the program both before this code block runs and after the GC runs, I notice that the ObjectContext for MyDataEntities now has 2 instances:

Object Type     Count    Size       Inclusive Size
ObjectContext   2        311,884    84,585,728

And that the first instance, which should no longer exist, is still holding memory:

Instance       Size       Inclusive Size
<0x34A73A0>    120        75,894,188
<0xB6C71AC>    311,764    8,691,540

Is there a way I can force Entity Framework to dispose of the context fully? The instance that should no longer exist still has 10,000 records attached to it as EntityWrapperWithoutRelationships:

EntityWrapperWithoutRelationships<MyProject.MyTableObject>    <0xB6A3F2C>

Pseudocode of what I am doing:

var itemCount = 0;
System.IO.StreamReader file = new System.IO.StreamReader(filepath, System.Text.Encoding.UTF8);
string line;

var context = new MyDataEntities();
context.Configuration.AutoDetectChangesEnabled = false;      

while ((line = file.ReadLine()) != null)
{                 
    //Match to see if exists
    var checkItem = context.MyTable.Where(f => ....).FirstOrDefault();
    if (checkItem != null)
    {
        //Update Data
        checkItem.MyProperty = line[0];
    } else {
        var myObj = context.MyTable.Create();
        myObj.MyProperty = line[0];
        context.MyTable.Add(myObj);
    }
                    
    itemCount++;

    if ((itemCount % 10000) == 0)
    {
        context.SaveChanges();

        //Recreate Context;
        context.Dispose();
        context = new MyDataEntities();
        context.Configuration.AutoDetectChangesEnabled = false;
    }
}

context.SaveChanges();
context.Dispose();

// be nice and close the file
file.Close();
S Grimminck
  • EF isn't great for doing bulk data imports like this. The context tends to accumulate cached data, becoming slower and slower. Some options are to completely disable change tracking, batch-insert records, or dispose of and recreate the context between batches. None of these are super performant though. The ideal solution is to use SQL's native bulk data import, which is available to EF via some NuGet extension packages (see the rough sketch after these comments). See: https://github.com/borisdj/EFCore.BulkExtensions – Bradley Uffner Jul 21 '21 at 14:12
  • Does this answer your question? [Fastest Way of Inserting in Entity Framework](https://stackoverflow.com/questions/5940225/fastest-way-of-inserting-in-entity-framework) – Bradley Uffner Jul 21 '21 at 14:14
  • @BradleyUffner I have no issues with the speed of the inserts. I fully realize that Entity Framework is slow for this purpose. The issue I'm concerned about is that the object is remaining in memory after the Dispose and creating a new context. – S Grimminck Jul 21 '21 at 14:17
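For reference, the native SQL bulk import suggested in the first comment could look roughly like the sketch below using SqlBulkCopy. The table and column names are placeholders, connectionString is assumed to be defined elsewhere, and this only covers the insert path, not the updates:

 // Build an in-memory DataTable matching the destination table's columns.
 // "MyTable" and "MyProperty" are placeholder names, not from the question.
 var table = new System.Data.DataTable();
 table.Columns.Add("MyProperty", typeof(string));

 foreach (var line in System.IO.File.ReadLines(filepath))
 {
     table.Rows.Add(line);
 }

 // Stream the rows to SQL Server in batches via the native bulk-copy API.
 // connectionString is assumed to be defined elsewhere.
 using (var bulkCopy = new System.Data.SqlClient.SqlBulkCopy(connectionString))
 {
     bulkCopy.DestinationTableName = "MyTable";
     bulkCopy.ColumnMappings.Add("MyProperty", "MyProperty");
     bulkCopy.BatchSize = 10000;
     bulkCopy.WriteToServer(table);
 }

Handling the update path would typically mean bulk-copying into a staging table and then running a MERGE, or using one of the bulk-extension packages linked above.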

0 Answers