How to make bulk insert work with multiple tables

Question

How with SQL server bulk insert can I insert into multiple tables when there is a foreign key relationship?

What I mean is that the tables are this,

CREATE TABLE [dbo].[UndergroundFacilityShape]
([FacilityID] [int] IDENTITY(1,1) NOT NULL,
[FacilityTypeID] [int] NOT NULL,
[FacilitySpatialData] [geometry] NOT NULL)

CREATE TABLE [dbo].[UndergroundFacilityDetail]
([FacilityDetailID] [int] IDENTITY(1,1) NOT NULL,
[FacilityID] [int] NOT NULL,
[Name] [nvarchar](50) NOT NULL,
[Value] [nvarchar](255) NOT NULL)

So each UndergroundFacilityShape can have multiple UndergroundFacilityDetail. The problem is that the FacilityID is not defined until the insert is done because it is an identity column. If I bulk insert the data into the Shape table then I cannot match it back up the the Detail data I have in my C# application.

I am guessing the solution is to run a SQL statement to find out what the next identity value is and popuplate the values myself and turn off the identity column for the bulk insert? Bear in mind that only one person is going to be running this application to insert data, and it will be done infrequently so we don't have to worry about identity values clashing or anything like that.

I am trying to import thousands of records, which takes about 3 minutes using standard inserts, but bulk insert will take a matter of seconds.

In the future I am expecting to import data that is much bigger than 'thousands' of records.

bulk inserts are for properly built data. If you've got unknown relationships at the time of insert (e.g. your parent/child foreign key values), it's not a candidate for bulk insertion. — Marc B, Oct 17 '11 at 20:53
I have a solution that works at the moment using straight inserts, so any solution would have to be faster than that. That is what my goal is. Otherwise I will just leave it how it is. — peter, Oct 17 '11 at 20:54
what about if you disable/enable the identity before and after the inserts? — Adrian Iftode, Oct 17 '11 at 22:59
what is the file structure? also can you show your insert solution, maybe we can speed that up. — HLGEM, Oct 27 '11 at 15:04

score 1 · Accepted Answer · answered Nov 14 '11 at 20:26

Turns out that this is quite simple. Get the current identity values on each of the tables, and populate them into the DataTable myself incrementing them as I use them. I also have to make sure that the correct values are used to maintain the relationship. That's it. It doesn't seem to matter whether I turn off the identity columns or not.

I have been using this tool on live data for a while now and it works fine.

It was well worth it though the import takes no longer than 3 seconds (rather than 3 minutes), and I expect to receive larger datasets at some point.

So what about if more than one person uses the tool at one time? Well yes I expect issues, but this is never going to be the case for us.

So from the time you fetch the last Id assigned to the header table to the time you complete your update, no other process inserts records into the header table? If you cannot guarantee that, you can get a key collision. — Eric J., Mar 26 '12 at 19:13
As Eric J hinted, you've got yourself a race-condition there. — CAD bloke, May 08 '15 at 03:07

score 0 · Answer 2 · answered Oct 27 '11 at 14:54

Peter, you mentioned that you already have a solution with straight INSERTs.

If the destination table does not have a clustered index (or has a clustered index and is empty), just using the TABLOCK query hint will make it a minimally-logged transaction, resulting on a considerable speed up.

If the destination table has a clustered index and is not empty, you can also enable trace flag 610 in addition to the TABLOCK query hint to make it a minimally-logged transaction.

Check the "Using INSERT INTO…SELECT to Bulk Import Data with Minimal Logging" section on the INSERT MSDN page.

How to make bulk insert work with multiple tables

2 Answers2

Linked