0

I have a data table that contains a name and a social security number. I want to insert the name into a table with an identity field, then insert the ssn with that new identity field value into another table.

Below are the tables:

CREATE TABLE [data_table] 
(
    [name] [varchar](50) NOT NULL,
    [ssn] [varchar](9) NOT NULL,
)

CREATE TABLE [entity_key_table] 
(
    [entity_key] [int] IDENTITY(1000000,1) NOT NULL,
    [name] [varchar](50) NOT NULL,
)

CREATE TABLE [entity_identifier_table] 
(
    [entity_identifier_key] [int] IDENTITY(1000000,1) NOT NULL,
    [entity_key] [int] NOT NULL,
    [ssn] [int] NOT NULL,
)

This query works but doesn't link entity_key in [entity_key_table] TO ssn in [entity_identifier_table]:

INSERT INTO entity_key_table (name)
OUTPUT [INSERTED].[entity_key]
INTO [entity_identifier_table] (entity_key)
    SELECT [name]
    FROM [data_table]

This is what I want to do, but it doesn't work.

INSERT INTO entity (name)
OUTPUT [INSERTED].[entity_key], [data_table].[ssn]
INTO [entity_identifier] (entity_key,ssn)
    SELECT [name]
    FROM [data_table]
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • For what database? Most have a mechanic to safely fetch the last generated ID, e.g. scope_identity() in T-SQL – Alex K. Jul 26 '17 at 15:06
  • SQL Server 2012. I have tried SCOPE_IDENTITY but couldn't get it to work. I will try again. Actually, OUTPUT is working, but I can't link it to the SSN in the source table. – user2463829 Jul 26 '17 at 15:19
  • Take a look at https://stackoverflow.com/questions/1920558/what-is-the-difference-between-scope-identity-identity-identity-and-ide – Xedni Jul 26 '17 at 15:20
  • OUTPUT is working in capturing the last IDENTITY value. My problem is associating that value with the SSN from the source record. – user2463829 Jul 26 '17 at 15:23
  • I'm looking into a possible solution, but I feel obligated to point out that having SSN in two places isn't properly normalized. Maybe that doesn't matter to you, but I thought I'd mention it. – Xedni Jul 26 '17 at 15:25
  • That's just the source data from a file. I said it was in a table just to make it simpler. – user2463829 Jul 26 '17 at 15:36
  • 1
    Not an answer to your question but you seriously need to reconsider storing SSN in clear text. That is actually illegal in some states and is absolutely negligent anywhere. And your datatype isn't large enough to store all SSN values anyway as they have started adding a character at the end to some of them. – Sean Lange Jul 26 '17 at 15:55
  • To agree with @Sean on encrypting, make sure to encrypt in a database trigger so that any changes to the filed no matter the source are encrypted. – HLGEM Jul 26 '17 at 17:34

2 Answers2

0

Rewriting my answer based on your requirements and the articles you linked. I think you can get that behavior doing something like this. I admit, I have never seen a merge on something like 1 != 1 like the article suggests, so I would be very cautious with this and test the bajeezes out out of it.

FWIW, it looks like during an INSERT, you can't access data that's not in the inserted virtual table, but updates (and apparently MERGE statements) can.

if object_id('tempdb.dbo.#data_table') is not null drop table #data_table
create table #data_table
(
    [name] [varchar](50) NOT NULL,
    [ssn] [varchar](9) NOT NULL,
)

if object_id('tempdb.dbo.#entity_key_table') is not null drop table #entity_key_table
create table #entity_key_table
(
    [entity_key] [int] IDENTITY(1000000,1) NOT NULL,
    name varchar(50)
)

if object_id('tempdb.dbo.#entity_identifier_table') is not null drop table #entity_identifier_table
create table #entity_identifier_table
(
    [entity_identifier_key] [int] IDENTITY(2000000,1) NOT NULL,
    [entity_key] [int] NOT NULL,
    [ssn] varchar(9) NOT NULL,
)
insert into #Data_table (Name, SSN)
select 'John', '123456789' union all
select 'John', '001100110' union all
select 'Jill', '987654321' 



merge into #entity_key_table t
using #data_table s
    on 1 != 1
when not matched then insert
(
    name   
)
values
(
    s.name
)
output inserted.entity_key, s.ssn
into #entity_identifier_table
(
    entity_key,
    ssn
);

select top 1000 *
from #data_table

select top 1000 *
from #entity_key_table

select top 1000 *
from #entity_identifier_table
Xedni
  • 3,662
  • 2
  • 16
  • 27
  • That looks very promising. I will work with it and let you know. – user2463829 Jul 26 '17 at 15:38
  • You answered the question I asked. Thank you very much. The problem is I asked the wrong question. There is actually no name field in the source data. There are fields whose values can be the same among records, so there is no unique field value to join on. I could do this row-by-row easily, but I don't see a way to do it by set. Row-by-row is, of course, the least desirable method. – user2463829 Jul 26 '17 at 15:53
  • Hah, I was afraid you might say something like that. Let me see if I can make some tweaks. – Xedni Jul 26 '17 at 16:03
  • I don't think there's a way to do what you've described given your table layout. My first suggestion would be don't de-normalize `SSN` (and in fact @SeanLange makes some very good points about potential legal issues as well). My second suggestion would be to create/store a unique key for each name/SSN, even if it's just another identity. Then, if you store THAT in `entity_key_table` instead of the non-unique `name` field, joining back the data on THAT field should work just fine. – Xedni Jul 26 '17 at 16:18
  • I think this article is similar to what I want to do. http://sqlserverplanet.com/sql-2008/using-the-merge-statements-output-clause-to-match-inserted-identity-fields-with-source-table-fields On second thought, probably not. – user2463829 Jul 26 '17 at 16:21
  • This one is getting closer. https://www.mssqltips.com/sqlservertip/3687/merge-parent-and-child-sql-server-tables-that-use-identity-keys/ – user2463829 Jul 26 '17 at 16:28
  • Actually, we are generating another identity value in the table with the SSN. I am working on existing tables that have been fully legally vetted. I am just trying to get data into them. Once I get this insert working, everything else is ordinary SQL. – user2463829 Jul 26 '17 at 16:35
  • Updated my post. Let me know if I hit on what you're trying to do (note, I made the other identity start at 200000, just so I can distinguish one from the other in this test harness) – Xedni Jul 26 '17 at 16:38
  • I am checking it out now. Thanks! – user2463829 Jul 26 '17 at 16:39
  • That is absolutely what I wanted to do. I really appreciate your help. I can't believe you came up with it so fast, or at all for that matter. Thanks again! – user2463829 Jul 26 '17 at 16:49
  • Glad I could help – Xedni Jul 26 '17 at 16:50
  • It worked like a charm! All I did was substitute the real table and field names. I can't thank you enough. – user2463829 Jul 26 '17 at 16:58
0

The problem with your code is that you output data only from inserted or deleted.

Assuming your name column only relates to one SSN, the following would work:

DECLARE @output TABLE (entity_key INT,ssn VARCHAR (11))

INSERT INTO entity (entity_key, name)
OUTPUT [INSERTED].[entity_key], [inserted].[name]
INTO @output 
SELECT D.Entity_key, d.name 
FROM datatable

INSERT INTO entity_identifier (entity_key, ssn)
Select o.entity_key, d.snn
from @output o
join datatable d on o.name = d.name

However, the problem of multiple duplicated names having different Social Security Numbers is extremely high. In this case, your current structure simply does not work because there is no way to know which identity belongs to which name record. (The Merge solution in another post may also have this problem, before you put that to production be sure to test the scenario of duplicated names. The chances of duplicated names in a set of records is extremely high in any reasonable large data set of names and this should be one of your unit test for any potential solution.)

Here is a potential workaround. First, insert the SSN as the name in the first insert, then return output as shown but join on the @output Name column to the SSN column. After doing the other insert, then update the name in the orginal table to the correct name again joining on the SSN data.

DECLARE @output TABLE (entity_key INT,ssn VARCHAR (11))

INSERT INTO entity (entity_key, name)
OUTPUT [INSERTED].[entity_key], [inserted].[ssn]
INTO @output 
SELECT D.Entity_key, d.name 
FROM datatable

INSERT INTO entity_identifier (entity_key, ssn)
Select o.entity_key, d.output
from @output o


update e
set name = d.name
FROM entity e
join @output o on e.entity_key = o.entity_key
join datatable d on o.name = d.ssn
HLGEM
  • 94,695
  • 15
  • 113
  • 186
  • You indeed answered the question I asked, but I asked the wrong question. There is actually no unique field I can join on. I did get the answer, though, from Xedni. It turns out that OUTPUT can't retrieve anything from the source table, but MERGE evidently can. Thanks for taking the time, and I apologize for asking the wrong question. – user2463829 Jul 26 '17 at 18:19
  • Yes but make sure to test the duplicate records to make sure it is in fact matching them correctly. Quite frankly I find that merge is hard to maintain and causes more problems when there are duplicates than any other technique. – HLGEM Jul 26 '17 at 20:09