0

I'm writing an import process which will import data from one (somewhat legacy) database to another. The import process takes one flat table with the source data. I have this populating a temp table (#SourcePersonAccount) at the start. The goal is to distribute this data into three destination tables (dbo.Person, dbo.Account & dbo.PersonAccount). This runs within a trigger on a table use SQL Server Replication, so needs to run quickly.

#SourcePersonAccount([AccountNumber], [CompanyId], [TargetPersonId], [TargetAccountId]);
dbo.Person ([Id] pk identity(1,1), [CompanyId], ...);
dbo.Account ([Id] pk identity(1,1), [AccountNumber], ...);
dbo.PersonAccount ([Id], [PersonId] fk_Person_Id, [AccountId] fk_Account_Id);

In my code, I have the TargetPersonId already populated in the #SourcePersonAccount temp table. All that's left is to 1) insert into dbo.Account, 2) update #SourcePersonAccount with the inserted dbo.Account.Id value, 3) insert into dbo.PersonAccount.

One of the challenges is that the AccountNumber and CompanyId make up a composite primary key of the source table, so both are needed to join properly on the #SourcePersonAccount temp table.

I have seen threads addressing similar issues to a certain extent here and here which did not solve my particular problem, mostly due to performance issues.

As stated in this post, the OUTPUT clause cannot output columns that were not included in the insert, so that is not an option here.

One solution I saw that technically can give the desired output (I can't find the link to where I found the suggestion) while using the OUTPUT clause is to actually add and drop a column within the query.

DECLARE @PersonAccountTbl TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);

ALTER TABLE [dbo].[Account]
    ADD [CompanyId] INT NULL;

INSERT INTO [dbo].[Account]
    ([AccountNumber], [CompanyId])
OUTPUT INSERTED.[Id], INSERTED.[AccountNumber], INSERTED.[CompanyId]
INTO @PersonAccountTbl
SELECT
    [AccountNumber], [CompanyId]
FROM #SourcePersonAccount
WHERE
    [TargetAccountId] IS NULL;

ALTER TABLE [dbo].[Account]
    DROP COLUMN [CompanyId];

This is not a viable option for my situation.

I tried using MERGE as every thread I've found on this issue recommends using it. I do not like MERGE for a few reasons. I tried it anyways; the below code gives the desired output, but ended up being much too slow for my purposes.

DECLARE @PersonAccountTbl TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);

MERGE INTO [dbo].[Account] a
USING #SourcePersonAccount spa
    ON spa.[TargetAccountId] IS NULL
WHEN NOT MATCHED THEN
    INSERT
        ([AccountNumber])
    VALUES
        (spa.[AccountNumber])
    OUTPUT INSERTED.[Id], INSERTED.[AccountNumber], spa.[CompanyId]
    INTO @PersonAccountTbl ([AccountId], [AccountNumber], [CompanyId]);

UPDATE spa
    SET spa.[TargetAccountId] = pat.[AccountId]
FROM #SourcePersonAccount spa
JOIN @PersonAccountTbl pat
    ON pat.[AccountNumber] = spa.[AccountNumber]
    AND pat.[CompanyId] = spa.[CompanyId];

INSERT INTO [dbo].[PersonAccount]
    ([PersonId], [AccountId])
SELECT
    spa.[TargetPersonId], spa.[TargetAccountId]
FROM #SourcePersonAccount spa
LEFT JOIN [dbo].[PersonAccount] pa
    ON pa.[PersonId] = spa.[TargetPersonId]
    AND pa.[AccountId] = spa.[TargetAccountId]
WHERE
    pa.[Id] IS NULL;

Is there a way other than MERGE or adding/dropping a column to accomplish this?

Pleebo
  • 163
  • 3
  • 13

1 Answers1

1

You can use a SEQUENCE instead of an IDENTITY column. Then you can assign the IDs to a temp table or table variable before you INSERT the data.

David Browne - Microsoft
  • 80,331
  • 6
  • 39
  • 67