
I am working on a console app (C#, ASP.NET Core 2.1, Entity Framework Core) connected to a local SQL Server database, the default (localdb)\MSSQLLocalDB (SQL Server 2016 v13.0) that ships with Visual Studio.

The problem I am facing is that inserting data into a table takes quite a long time. The table has 400,000 rows and 6 columns, and I insert the rows 200 at a time.

Right now, the request takes 20 seconds to execute, and this execution time keeps increasing. Considering that I still have 20,000 × 200 rows to insert, it's worth figuring out where this problem comes from!

A couple of facts:

  • There is no index on the table
  • My computer is not new, but I have quite good hardware (i7, 16 GB RAM) and I don't hit 100% CPU while inserting

So, my questions are:

  • Is 400K rows considered a 'large' table? I've never worked with a table this big before, but I thought it was common to have a dataset like this.
  • How can I investigate where the insert time comes from? I have only Visual Studio installed so far (but I am open to other options)

Here is the SQL code of the table in question:

CREATE TABLE [dbo].[KfStatDatas] 
(
    [Id]           INT IDENTITY (1, 1) NOT NULL,
    [DistrictId]   INT           NOT NULL,
    [StatId]       INT           NOT NULL,
    [DataSourceId] INT           NOT NULL,
    [Value]        NVARCHAR(300) NULL,
    [SnapshotDate] DATETIME2(7)  NOT NULL
);

EDIT: I ran SQL Server Management Studio and found the query that is slowing down the whole process. It is the insert query.

But, looking at the SQL query created by Entity Framework, it seems to be doing an inner join and going through the whole table, which would explain why the processing time increases with the table size.

I may be missing a point, but why would you need to scan the whole table to add rows?

Raw query being executed:

SELECT [t].[Id] 
FROM [KfStatDatas] t
INNER JOIN @inserted0 i ON ([t].[Id] = [i].[Id])
ORDER BY [i].[_Position]


EDIT and SOLUTION

I eventually found the issue, and it was a silly mistake: my Id field was not declared as a primary key! So the system had to go through the whole table for every inserted row. I added the PK and it now takes... 100 ms for 200 rows, and this duration is stable.
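For reference, a minimal sketch of how the key can be declared explicitly on the EF Core entity so a code-first migration creates the primary key (the entity class name `KfStatData` is an assumption; the properties mirror the table columns above):

```csharp
using System;
using System.ComponentModel.DataAnnotations;

// Hypothetical entity class mapped to dbo.KfStatDatas.
public class KfStatData
{
    [Key] // explicit; EF Core would also discover a property named "Id" by convention
    public int Id { get; set; }

    public int DistrictId { get; set; }
    public int StatId { get; set; }
    public int DataSourceId { get; set; }

    [MaxLength(300)]
    public string Value { get; set; }

    public DateTime SnapshotDate { get; set; }
}
```

Note that EF only emits the primary key constraint when it creates the table itself (via migrations or `EnsureCreated`); if the table was created by hand, the constraint has to be added manually as in the accepted answer.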

Thanks for your time!

David Browne - Microsoft
XavierAM

2 Answers


I think you may simply be missing a primary key. You've declared to EF that Id is the entity key, but you don't have a unique index on the table to enforce that.

And when EF wants to fetch the inserted IDs without an index, it's expensive. So this query

SELECT t.id from KfStatDatas t
inner join @inserted0 i 
  on t.id = i.id
order by i._Position

performs 38K logical reads and takes 16 seconds on average.

So try:

ALTER TABLE [dbo].[KfStatDatas]
ADD CONSTRAINT PK_KfStatDatas
PRIMARY KEY (id)

BTW, are you sure this is EF6? This looks more like an EF Core batch insert.

David Browne - Microsoft
  • Yes, you are right; I was editing my question at the same time. It was as stupid as this: I was missing the primary key... Sorry for the annoyance. Btw, I thought it was mandatory to have a PK explicitly declared when doing the Code First approach? That's why I was not suspicious. Seemingly not. And yes, it's EF Core. – XavierAM Jul 13 '19 at 15:36
  • In Code First, if you let EF create your table, it will create the primary key and supporting indexes for your FKs. But if you create the tables yourself, it's not mandatory. EF doesn't check that the tables you created have the indexes it expects. The target tables just need to have the columns that the EF mapping says they do, which is why you can map to views, synonyms, etc. – David Browne - Microsoft Jul 13 '19 at 15:41
  • Understood, but it was EF that created this table based on a class (which was missing the Key attribute). That's why I am a bit unclear. – XavierAM Jul 13 '19 at 18:37
  • It should have discovered the key by convention, and wouldn't have allowed you to change an entity that didn't have an entity key. – David Browne - Microsoft Jul 13 '19 at 21:18

No, 400K rows is not large.

The most efficient way to insert a large number of rows from .NET is with SqlBulkCopy. This should take seconds rather than minutes for 400K rows.
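A minimal `SqlBulkCopy` sketch for this table (the connection string and the source `DataTable` are placeholders; on the ASP.NET Core 2.1 stack the namespace is `System.Data.SqlClient`, while newer projects use `Microsoft.Data.SqlClient`):

```csharp
using System.Data;
using System.Data.SqlClient; // Microsoft.Data.SqlClient on newer stacks

static class BulkLoader
{
    // Streams all rows in `rows` into dbo.KfStatDatas in batches.
    // The DataTable's columns must match the destination table's columns.
    public static void BulkInsert(string connectionString, DataTable rows)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "dbo.KfStatDatas";
                bulk.BatchSize = 10000;   // rows sent per round-trip
                bulk.WriteToServer(rows); // streams the whole DataTable
            }
        }
    }
}
```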

When batching individual inserts, execute the entire batch in a single transaction to improve throughput. Otherwise, each insert is committed individually, requiring a synchronous flush of the log buffer to disk for each insert to harden the transaction.
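In EF Core that can be sketched by wrapping several `SaveChanges` batches in one explicit transaction, so the log is flushed once at `Commit` instead of once per batch (`MyDbContext` and `allRows` are hypothetical placeholders):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Assumed: MyDbContext with DbSet<KfStatData> KfStatDatas,
// and allRows is a List<KfStatData> of rows to insert.
static void InsertInBatches(List<KfStatData> allRows)
{
    using (var context = new MyDbContext())
    using (var transaction = context.Database.BeginTransaction())
    {
        const int batchSize = 200;
        for (int i = 0; i < allRows.Count; i += batchSize)
        {
            context.KfStatDatas.AddRange(allRows.Skip(i).Take(batchSize));
            context.SaveChanges(); // each batch still participates in the open transaction
        }
        transaction.Commit(); // single log flush for all batches
    }
}
```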

EDIT:

I see from your comment that you are using Entity Framework. This answer may help you use SqlBulkCopy with EF.

Dan Guzman
  • OK thanks. Is EF not ok for that kind of insertion? – XavierAM Jul 13 '19 at 14:12
  • @XavierAM, I added a link to a related EF answer that may help. If performance is acceptable with that method, then it's perfectly ok. You might need to resort to the lower-level API calls when performance is more important than the EF benefits. – Dan Guzman Jul 13 '19 at 14:16
  • Thx. I will have a look but for the time being, I would like to stick to EF syntax. It's really convenient :-) – XavierAM Jul 13 '19 at 15:16