0

I am trying to take all the ids from a table of 1.3 million rows and insert them into another table in another database using Entity Framework. When I do this with SSIS, it is done in under 5 minutes, when I do it using Entity Framework, it takes 3.5 hours.

I have read several sources on slow EF insertion and they seemed to indicate that I should turn off Auto Detect Changes Enabled and that I should avoid using loops I have tried with and without loops and it still takes a very long time.

My code without loops is as follows:

_DB2context.Configuration.AutoDetectChangesEnabled = false;
_DB2context.Table2.AddRange(_DB1context.Table1.Select(m => m.Id)
                                              .Select(pen => new Table2() { Checked = false, Id = pen }));
await _DB2context.SaveChangesAsync();

The third line takes about 3.5 hours to execute. When I tried to do this with loops my code was:

var idList = _DB1context.Table1.Select(m => m.Id).ToList();                
int i = 0;

foreach (var id in idList)
{
    i++;

    _DB2context.Table2.Add(new Table2() {Checked = false, Id = id});

    if ((i % 10000)==0)
    {
        await _DB2context.SaveChangesAsync();
    }
}

Is there anything obvious I am doing wrong and is there any way to make this behave more like ssis in running this?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Anglefroghammer
  • 238
  • 2
  • 15
  • How many records does it usually insert? – Ali Bahrami Jul 01 '19 at 09:47
  • Have you considered using `SqlBulkCopy`? – mjwills Jul 01 '19 at 09:47
  • 1.3 million rows. No, I've not consided SqlBulkCopy I'll look into that. – Anglefroghammer Jul 01 '19 at 09:52
  • SqlBulkCopy looks like it would work but it's not using entity framework. I have a lot of mock database tests that rely on using entity framework code for the tests so I'd rather have the code work within entity framework if possible. – Anglefroghammer Jul 01 '19 at 09:57
  • You can have it fast, or you can use EF. Pick one. `SqlBulkCopy`, in my experience, is usually at least 100x faster with large volumes of data. – mjwills Jul 01 '19 at 10:01
  • This is my experience as well which is why I am asking the community to see if they know a way to make ef faster. Sql Bulk Copy may be what I do if what you are saying is 100% correct. I would be okay if the process took an hour but 3.5 hours is far too long. – Anglefroghammer Jul 01 '19 at 10:16
  • 1
    I think this is comparing apples and oranges. SSIS is designed to perform bulk operations. EF isn't. The comparison aside, there are numerous posts on how to improve EF's insert behavior. For instance, [this classic one](https://stackoverflow.com/a/5942176/861716), which is a duplicate if you ask me. – Gert Arnold Jul 01 '19 at 15:13
  • Thanks for posting that, I have used that many times in the past but in my looped version I am only saving every 10,000 records and while I am not recreating the context according to that post this was taking them 200 seconds to do 560,000 records. So the fact that it is taking me 3.5 hours seems a bit off to me so I wanted to be sure that I am not doing something wrong. – Anglefroghammer Jul 01 '19 at 15:29
  • 1
    I think recreating the context will have a huge effect as it did in the post I linked. Also, you should keep using AddRange in the loop. – Gert Arnold Jul 02 '19 at 07:23
  • I am going to have a go recreating 100 at a time while recreating as this is the fastest time they have. I will also save as a range as suggested. – Anglefroghammer Jul 02 '19 at 09:41
  • Getting 7 minutes for 100k which is a lot better, should take about 91 minutes. – Anglefroghammer Jul 02 '19 at 10:18
  • Ended up getting 1 hour 15 minutes, not perfect but good enough for my purposes. – Anglefroghammer Jul 02 '19 at 11:25

1 Answers1

1

Entity Framework performs one database round-trip for every entity to save. So if you have 1.3 million rows to insert, it will make 1.3 million database round-trip which is INSANELY slow.

As @mjwills already pointed, you can use SqlBulkCopy to get optimal performance.


Disclaimer: I'm the owner of Entity Framework Extensions

This library is not free but allows you to perform all bulk operations including BulkSaveChanges and BulkInsert:

  • Bulk SaveChanges
  • Bulk Insert
  • Bulk Delete
  • Bulk Update
  • Bulk Merge

Example

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);

// Customize Bulk Operations
context.BulkInsert(customers, options => {
   options => options.IncludeGraph = true;
});
context.BulkMerge(customers, options => {
   options.ColumnPrimaryKeyExpression = 
        customer => customer.Code;
});
Jonathan Magnan
  • 10,874
  • 2
  • 38
  • 60
  • I don't mind a bit of self promotion when it fits the bill as this does however if it is a choice between several hundred dollars a year and slightly compromising on principles I think my boss is going to choose the latter. I hope that your product eventually becomes part of the entity framework core product. – Anglefroghammer Jul 01 '19 at 11:26