I'm writing an import process which will import data from one (somewhat legacy) database to another. The import process takes one flat table with the source data. I have this populating a temp table (#SourcePersonAccount
) at the start. The goal is to distribute this data into three destination tables (dbo.Person
, dbo.Account
& dbo.PersonAccount
). This runs within a trigger on a table use SQL Server Replication, so needs to run quickly.
#SourcePersonAccount([AccountNumber], [CompanyId], [TargetPersonId], [TargetAccountId]);
dbo.Person ([Id] pk identity(1,1), [CompanyId], ...);
dbo.Account ([Id] pk identity(1,1), [AccountNumber], ...);
dbo.PersonAccount ([Id], [PersonId] fk_Person_Id, [AccountId] fk_Account_Id);
In my code, I have the TargetPersonId
already populated in the #SourcePersonAccount
temp table. All that's left is to 1) insert into dbo.Account
, 2) update #SourcePersonAccount
with the inserted dbo.Account.Id
value, 3) insert into dbo.PersonAccount
.
One of the challenges is that the AccountNumber
and CompanyId
make up a composite primary key of the source table, so both are needed to join properly on the #SourcePersonAccount
temp table.
I have seen threads addressing similar issues to a certain extent here and here which did not solve my particular problem, mostly due to performance issues.
As stated in this post, the OUTPUT
clause cannot output columns that were not included in the insert, so that is not an option here.
One solution I saw that technically can give the desired output (I can't find the link to where I found the suggestion) while using the OUTPUT
clause is to actually add and drop a column within the query.
DECLARE @PersonAccountTbl TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);
ALTER TABLE [dbo].[Account]
ADD [CompanyId] INT NULL;
INSERT INTO [dbo].[Account]
([AccountNumber], [CompanyId])
OUTPUT INSERTED.[Id], INSERTED.[AccountNumber], INSERTED.[CompanyId]
INTO @PersonAccountTbl
SELECT
[AccountNumber], [CompanyId]
FROM #SourcePersonAccount
WHERE
[TargetAccountId] IS NULL;
ALTER TABLE [dbo].[Account]
DROP COLUMN [CompanyId];
This is not a viable option for my situation.
I tried using MERGE
as every thread I've found on this issue recommends using it. I do not like MERGE
for a few reasons. I tried it anyways; the below code gives the desired output, but ended up being much too slow for my purposes.
DECLARE @PersonAccountTbl TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);
MERGE INTO [dbo].[Account] a
USING #SourcePersonAccount spa
ON spa.[TargetAccountId] IS NULL
WHEN NOT MATCHED THEN
INSERT
([AccountNumber])
VALUES
(spa.[AccountNumber])
OUTPUT INSERTED.[Id], INSERTED.[AccountNumber], spa.[CompanyId]
INTO @PersonAccountTbl ([AccountId], [AccountNumber], [CompanyId]);
UPDATE spa
SET spa.[TargetAccountId] = pat.[AccountId]
FROM #SourcePersonAccount spa
JOIN @PersonAccountTbl pat
ON pat.[AccountNumber] = spa.[AccountNumber]
AND pat.[CompanyId] = spa.[CompanyId];
INSERT INTO [dbo].[PersonAccount]
([PersonId], [AccountId])
SELECT
spa.[TargetPersonId], spa.[TargetAccountId]
FROM #SourcePersonAccount spa
LEFT JOIN [dbo].[PersonAccount] pa
ON pa.[PersonId] = spa.[TargetPersonId]
AND pa.[AccountId] = spa.[TargetAccountId]
WHERE
pa.[Id] IS NULL;
Is there a way other than MERGE
or adding/dropping a column to accomplish this?