I have a table that contains more than half a million records. Each record has about 60 fields, but we only make changes to three of them.

We make a small modification to each entity based on a calculation and a lookup.

Clearly I can't update each entity in turn and call SaveChanges each time, as that would take far too long.

So at the end of the whole process I call SaveChanges once on the context.

This causes an Out of Memory error when I call SaveChanges.

I'm using the DataRepository pattern.

//Update code
DataRepository<ExportOrderSKUData> repoExportOrders = new DataRepository<ExportOrderSKUData>();
foreach (ExportOrderSKUData grpDCItem in repoExportOrders.All())
{
  // ... make changes to entity ...
}
repoExportOrders.SaveChanges();



//Data repository snip
public class DataRepository<T> where T : class
{
  private readonly tomEntities _context;
  private readonly IObjectSet<T> _objectSet;

  public DataRepository()
  {
    _context = new tomEntities();
    _objectSet = _context.CreateObjectSet<T>();
  }
  public List<T> All()
  {
    return _objectSet.ToList<T>();
  }
  public void SaveChanges()
  {
    _context.SaveChanges();
  }
}

What should I be looking for in this instance?

griegs
  • You said "Clearly I can't update each entity in turn and call SaveChanges each time, as that would take far too long." Would it work for you to call `SaveChanges()` for some arbitrary # of records, perhaps 1024 or 4096? – David Tansey Apr 11 '13 at 05:57
  • Have you looked at the following post on SO [http://stackoverflow.com/questions/5940225/fastest-way-of-inserting-in-entity-framework/5942176#5942176](http://stackoverflow.com/questions/5940225/fastest-way-of-inserting-in-entity-framework/5942176#5942176)? – David Tansey Apr 11 '13 at 05:59
  • @David Yeah I tried that, and that seems to work OK, but I'm not sure this is the best *technical* solution. Know what I mean? – griegs Apr 11 '13 at 05:59
  • Yeah, I feel ya. OUT OF MEMORY is pretty painful though -- any relief seems welcome. – David Tansey Apr 11 '13 at 06:06
  • @griegs I think within the technical boundaries of your project the solution is in Slauma's answer there. There are other technical solutions like stored procedures or basic `SqlCommand`s, each of which won't use many client resources. Would such solutions be an option for you? – Gert Arnold Apr 11 '13 at 07:30
  • I think that in this case you should avoid entity framework. – L-Four Apr 11 '13 at 07:39

1 Answer

Making changes to half a million records through EF within one transaction is not a supported use case. Doing it in small batches is a better technical solution. Doing it on the database side through a stored procedure or a set-based UPDATE can be an even better solution.
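
If the calculation and the lookup can be expressed in SQL, a single set-based UPDATE never materializes any entities on the client. Here is a minimal sketch using ObjectContext.ExecuteStoreCommand - the column names and the lookup table below are hypothetical placeholders, not your real model:

using (var context = new tomEntities()) {
    // Hypothetical set-based version of the per-row calculation + lookup.
    // Everything executes on the database server; nothing is loaded into client memory.
    int rowsAffected = context.ExecuteStoreCommand(
        @"UPDATE o
          SET    o.FieldA = o.Quantity * l.Rate
          FROM   ExportOrderSKUData o
          JOIN   RateLookup l ON l.Id = o.RateId");
}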

I would start by slightly modifying your code (translate it to your repository API yourself):

using (var readContext = new YourContext()) {
    var set = readContext.CreateObjectSet<ExportOrderSKUData>();

    foreach (var item in set.ToList()) {
        // Detach from the reading context so the entity can be attached to the update context.
        readContext.Detach(item);
        using (var updateContext = new YourContext()) {
            updateContext.Attach(item);
            // make your changes
            updateContext.SaveChanges();
        }
    }
}

This code uses a separate context for saving each item, so each save runs in its own transaction. Don't be afraid of that: even if you save many records within one call of SaveChanges, EF still uses a separate round trip to the database for every updated record. The only difference is whether you want multiple updates inside the same transaction (but having half a million updates in a single transaction would cause issues anyway); a sketch of that follows.
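
To make the transaction point concrete: if a group of updates must commit atomically, attach the whole group to a single context and call SaveChanges once for the group; EF wraps that single call in one database transaction. A sketch, where `batch` is a hypothetical small slice of the entities:

using (var updateContext = new YourContext()) {
    foreach (var item in batch) {
        updateContext.Attach(item);
        // make your changes
    }
    // One SaveChanges call = one transaction, though EF still issues one UPDATE per row.
    updateContext.SaveChanges();
}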

Another option may be:

using (var readContext = new YourContext()) {
    var set = readContext.CreateObjectSet<ExportOrderSKUData>();
    // NoTracking streams results without caching entities in the context.
    set.MergeOption = MergeOption.NoTracking;

    foreach (var item in set) {
        using (var updateContext = new YourContext()) {
            updateContext.Attach(item);
            // make your changes
            updateContext.SaveChanges();
        }
    }
}

This can in theory consume even less memory because you don't need all entities loaded before the foreach begins. The first example probably has to load all entities prior to enumeration (that is why it calls ToList) to avoid an exception when calling Detach (modifying the collection during enumeration) - but I'm not sure that really happens.

Modifying those examples to use batches should be easy.
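
For example, here is the second variant reworked to commit every 1,000 items through a fresh context - the batch size is arbitrary (David Tansey suggested 1024 or 4096 in the comments); tune it for your environment:

const int batchSize = 1000;

using (var readContext = new YourContext()) {
    var set = readContext.CreateObjectSet<ExportOrderSKUData>();
    set.MergeOption = MergeOption.NoTracking;

    var updateContext = new YourContext();
    try {
        int count = 0;
        foreach (var item in set) {
            updateContext.Attach(item);
            // make your changes
            if (++count % batchSize == 0) {
                updateContext.SaveChanges();
                // Dispose to release the tracked entities, then start fresh.
                updateContext.Dispose();
                updateContext = new YourContext();
            }
        }
        updateContext.SaveChanges(); // flush the final partial batch
    }
    finally {
        updateContext.Dispose();
    }
}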

Ladislav Mrnka
  • I like these even if they don't work. I still get an out of memory error. However, I can see the value in this answer for other stuff I may need to do in this project. Thank you – griegs Apr 11 '13 at 21:37
  • You should probably use a memory profiler to find what exactly is causing the issue. If neither of these examples helps with the OOM exception, there may be some bigger hidden issue. – Ladislav Mrnka Apr 11 '13 at 22:44
  • Agreed. It could simply be that my PC is running out of memory and that it'll cease to be an issue on the server but right now I can't check if that's the case or not. Bit weird here with access. – griegs Apr 11 '13 at 23:00