
I'm working on some bulk inserts with Entity Framework Core. To minimize round trips to the database, the new inserts are batched in groups of 100 before being added to the database context and saved using SaveChanges().

The current problem is that if any row in the batch fails to insert because of, e.g., a unique key violation on the table, the entire transaction is rolled back. In this scenario it would be ideal to simply discard the records that could not be inserted and insert the rest.

I'm more than likely going to need to write a stored procedure for this, but is there any way to have Entity Framework Core skip over rows that fail to insert?
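
For reference, the batching currently looks roughly like this; AppDbContext, Record, and the Records DbSet are illustrative names rather than my real ones:

// Illustrative batching loop; AppDbContext and Record are placeholder names.
using System.Collections.Generic;

public static class BulkInserter
{
    public static void InsertInBatches(AppDbContext context, IEnumerable<Record> records, int batchSize = 100)
    {
        var batch = new List<Record>(batchSize);
        foreach (var record in records)
        {
            batch.Add(record);
            if (batch.Count == batchSize)
            {
                SaveBatch(context, batch);
                batch.Clear();
            }
        }
        if (batch.Count > 0)
            SaveBatch(context, batch);
    }

    private static void SaveBatch(AppDbContext context, List<Record> batch)
    {
        context.Records.AddRange(batch);
        // A single unique key violation here throws and rolls back the whole batch,
        // which is the behaviour I'm trying to avoid.
        context.SaveChanges();
    }
}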

Kal Zekdor
  • Here are two postings that you may want to read. I'm not suggesting either as a solution, only that they are related to your issue: https://www.sqlteam.com/articles/introduction-to-transactions and https://stackoverflow.com/questions/22486489/entity-framework-6-transaction-rollback – jdweng Apr 10 '21 at 20:45
  • @jdweng Thanks, but the problem is that I don't want the transaction to roll back on failure; ideally I'd like it to commit whichever rows it can and discard the rest. – Kal Zekdor Apr 10 '21 at 21:14
  • You only want the failing row to roll back, not all the rows. – jdweng Apr 10 '21 at 22:49
  • Your insert must include a `where not exists` check. I don't know if that's possible with EF; if not, the best approach would be a stored procedure that you pass the batch of data to, either as serialized JSON or as a data table mapped to a custom table type. – Stu Apr 10 '21 at 23:01

2 Answers


In your stored procedure, use a MERGE statement instead of an INSERT, and only act on the WHEN NOT MATCHED branch:

MERGE targetTable WITH (HOLDLOCK) AS existing
USING @tvp AS incoming
    ON incoming.PK = existing.PK
WHEN NOT MATCHED THEN
    INSERT (PK, Col1, Col2)   -- column list is illustrative
    VALUES (incoming.PK, incoming.Col1, incoming.Col2);

The records that match an existing row are discarded; only the unmatched rows are inserted. The @tvp is the table-valued parameter that your app code passes to the stored procedure.
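
On the app side, the batch can be shipped as a DataTable mapped to a user-defined table type. A rough sketch of the call; the proc name dbo.InsertRecords, the type name dbo.RecordTableType, and the DataTable layout are placeholders for whatever you actually define:

// Passes the batch as a structured (TVP) parameter to the stored procedure.
// The DataTable's column names and types must match dbo.RecordTableType.
using System.Data;
using Microsoft.Data.SqlClient; // or System.Data.SqlClient

public static class TvpCaller
{
    public static void InsertBatch(string connectionString, DataTable batch)
    {
        using var connection = new SqlConnection(connectionString);
        using var command = new SqlCommand("dbo.InsertRecords", connection)
        {
            CommandType = CommandType.StoredProcedure
        };

        var tvp = command.Parameters.AddWithValue("@tvp", batch);
        tvp.SqlDbType = SqlDbType.Structured;
        tvp.TypeName = "dbo.RecordTableType"; // the user-defined table type on SQL Server

        connection.Open();
        command.ExecuteNonQuery();
    }
}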

There are locking considerations when using the MERGE statement that may or may not apply to your scenario. It's worth reading up on concurrency and atomicity for it to make sure you cover the rest of your bases.

StingyJack

If you are going the stored-procedure route, you can declare TVPs. In C#, when you fill the TVP and the insert fails, you will know in the catch block that it failed, and can then recurse over those 100 rows.

The recursive function breaks the N rows into two halves of N/2 and calls the TVP fill again on each half. If the first half is fine it will be inserted; if the second half fails, you simply call the recursive function again on that half. This keeps the safe records in the TVP and collects the failed records separately. You can cap the recursion at a depth X, where X is a small number such as 5, 6, or 7; after that, you will be left with only the bad records (roughly as in the sketch below).
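
A rough sketch of that recursion in C#; Record and the TryInsertBatch helper (which fills the TVP, calls the stored procedure, and returns false when the insert throws) are placeholders:

using System.Collections.Generic;
using System.Linq;

public static class BisectingInserter
{
    // Returns the rows that could not be inserted after at most maxDepth splits.
    public static List<Record> InsertSkippingFailures(List<Record> rows, int depth = 0, int maxDepth = 6)
    {
        var failed = new List<Record>();
        if (rows.Count == 0)
            return failed;

        if (TryInsertBatch(rows))
            return failed;                 // the whole set went in cleanly

        if (rows.Count == 1 || depth >= maxDepth)
        {
            failed.AddRange(rows);         // only bad records remain
            return failed;
        }

        // Split into two halves and retry each half independently.
        var half = rows.Count / 2;
        failed.AddRange(InsertSkippingFailures(rows.Take(half).ToList(), depth + 1, maxDepth));
        failed.AddRange(InsertSkippingFailures(rows.Skip(half).ToList(), depth + 1, maxDepth));
        return failed;
    }

    // Placeholder: fill the TVP with the rows, call the stored procedure,
    // and return false if the database rejects the batch.
    private static bool TryInsertBatch(List<Record> rows) => true;
}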

If you need to know more about TVPs, you can see this.

Note: you cannot use parallel query execution with this approach.

  • Going back and forth between the application code and the database to do this is going to be slow, maybe even slower than just doing singular inserts. – StingyJack Apr 11 '21 at 01:39
  • @StingyJack I was aware of the approach I gave, but I find your answer more suitable here. I was not aware of the MERGE statement and have never used it. Thanks for teaching us another SQL technique. – Naveed Yousaf Apr 12 '21 at 21:40