
I am trying to insert a ton of data into a SQL Server database, and my question is how often I should save when millions of entries are going to be inserted. Is it more efficient to save after e.g. 10,000 inserts or after 1,000,000 inserts? Are there any go-to values? I am using C# Entity Framework to populate the server with the data, so I am using a DbContext class which handles the CRUD and saving.
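
To make the question concrete, this is roughly the pattern I mean (MyDbContext, MyEntity and sourceRows are placeholders for my real context, entity type and input data):

```csharp
// Rough sketch of the batching pattern in question.
// MyDbContext, MyEntity and sourceRows are placeholders for the real types/data.
const int batchSize = 10000; // the value I am asking about

using (var context = new MyDbContext())
{
    var count = 0;
    foreach (var row in sourceRows)
    {
        context.Entities.Add(new MyEntity { /* map fields from row */ });
        count++;

        if (count % batchSize == 0)
        {
            context.SaveChanges(); // "save" = flush the pending inserts to SQL Server
        }
    }

    context.SaveChanges(); // flush the remainder
}
```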

Julian Peil
  • what do you mean by "save it" exactly? Once you run an INSERT command the data for that row (or rows) is in the database, unless you're using transactions, in which case it waits until you commit the transaction. How are you inserting the data, exactly? Or are you talking about making a backup, perhaps? Also why do you think changing this "save" point might affect efficiency? In what way? What do you expect to gain? It's not at all clear what you are asking, I'm afraid. – ADyson Aug 10 '18 at 12:32
  • This is a very broad question. Based on your question, it doesn't sound like SQL Server is the only piece in the processing chain. SQL Server is designed to be highly efficient, but it also has to know what you want it to do efficiently so database design is a major factor (table structure, indexing, etc). There's really no way we can answer this for you without knowing the architecture of your application. – squillman Aug 10 '18 at 12:33
  • Ah so you're maybe talking about using SaveChanges() in entity framework? That effectively commits a transaction in the background. I don't think it would affect "efficiency" much, although that in itself is a hugely broad term. If you're doing such a bulk load I assume you're not worrying about concurrent access at this point, or are you? If you want to make your load process faster in general though, just don't use Entity Framework at all, use a suitable bulk loading tool – ADyson Aug 10 '18 at 12:48
  • https://ericlippert.com/2012/12/17/performance-rant/ – mjwills Aug 10 '18 at 12:54
  • Don't forget to set AutoDetectChangesEnabled = false; on the DbContext. See here http://stackoverflow.com/questions/5943394/why-is-inserting-entities-in-ef-4-1-so-slow-compared-to-objectcontext/5943699#5943699 – Gauravsa Aug 10 '18 at 12:55
  • Thank you guys for your answers and suggestions! – Julian Peil Aug 10 '18 at 13:25
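
For reference, a minimal sketch of the tweak Gauravsa suggests above, assuming an EF 4.1–6.x style DbContext (MyDbContext, MyEntity and sourceRows are placeholders):

```csharp
// Sketch of the tweak from the comments: turn off automatic change detection
// while bulk-adding, so each Add() does not rescan the whole change tracker.
// MyDbContext / MyEntity / sourceRows are placeholders; syntax is EF 4.1-6.x.
using (var context = new MyDbContext())
{
    context.Configuration.AutoDetectChangesEnabled = false;

    foreach (var row in sourceRows)
    {
        context.Entities.Add(new MyEntity { /* map fields from row */ });
    }

    context.SaveChanges();
}
```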

1 Answer


> I am using C# Entity Framework to populate the server with the data

Your problems start here. Entity Framework is not suitable for this. WRONG TOOL. I like Entity Framework, but it is an ORM, not a bulk-loading tool.

I generally use the SqlBulkCopy class to load batches of up to 65k records into a temporary table (it takes seconds, but beware: you pretty much must use a temporary table), which I then copy to the final table in one command / transaction.

This handles 4-5 parallel threads at full speed.

Entity Framework will take hours for a million rows, however you tune it. It is just the wrong tool.
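
A rough sketch of that approach, assuming System.Data.SqlClient; the connection string, the #Staging layout and dbo.TargetTable are placeholders, and error handling is omitted:

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch of the SqlBulkCopy + temp table approach described above.
// connectionString, #Staging columns and dbo.TargetTable are illustrative.
static void BulkLoadBatch(string connectionString, DataTable batch)
{
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();

        // 1. Create a session-scoped temp table matching the target layout.
        using (var create = new SqlCommand(
            "CREATE TABLE #Staging (Id INT, Name NVARCHAR(100))", connection))
        {
            create.ExecuteNonQuery();
        }

        // 2. Bulk-load the batch (up to ~65k rows) into the temp table.
        using (var bulk = new SqlBulkCopy(connection))
        {
            bulk.DestinationTableName = "#Staging";
            bulk.WriteToServer(batch);
        }

        // 3. Copy into the final table in one command / transaction.
        using (var copy = new SqlCommand(
            "INSERT INTO dbo.TargetTable (Id, Name) SELECT Id, Name FROM #Staging",
            connection))
        {
            copy.ExecuteNonQuery();
        }
    }
}
```

The temporary table is what lets the final step run as a single set-based INSERT ... SELECT instead of millions of individual statements.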

TomTom
  • Also look at [SSIS](https://learn.microsoft.com/en-us/sql/integration-services/sql-server-integration-services?view=sql-server-2017) and [Bulk Copy](https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/bulk-copy-operations-in-sql-server) – TGnat Aug 10 '18 at 12:49