0

I am in the process of migrating an old legacy system that has mostly flat database tables with way too many columns. Each new setting requires a new column etc so the tables are getting ridiculously big. I am trying to change this structure to a relational one and am struggling to migrate the old existing data into the new structure. Here is an example of the old table we have:

CREATE TABLE [dbo].[User_OLD](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [FullName] [nvarchar](50) NULL, 
    [Email] [nvarchar](50) NULL,
    [Setting1] [bit] NULL,
    [Setting1Value] [int] NULL,
    [Setting2] [bit] NULL,
    [Setting2Value] [int] NULL,
    [Setting3] [bit] NULL,
    [Setting3Value] [int] NULL,
    CONSTRAINT [PK_User] PRIMARY KEY CLUSTERED ([Id] ASC));

This table is being migrated into multiple tables, here is an example:

CREATE TABLE [dbo].[User_NEW](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [FullName] [nvarchar](50) NULL,
    [Email] [nvarchar](50) NULL,
    CONSTRAINT [PK_User] PRIMARY KEY CLUSTERED ([Id] ASC));

CREATE TABLE [dbo].[UserSetting](
    [Id] [int] IDENTITY(1,1) NOT NULL,  
    [UserId] [int] NOT NULL,    
    [SettingName] [varchar](250) NOT NULL,  
    [SettingValue] [varchar](250) NOT NULL,         
    [CreatedOn] [datetime] NOT NULL,    
    CONSTRAINT [PK_UserSetting] PRIMARY KEY CLUSTERED ([Id] ASC),
    CONSTRAINT FK_UserSetting_User FOREIGN KEY ([UserId]) REFERENCES User_NEW(Id));

So the issue is that I need to take a record from User_OLD and insert it's values into User_NEW, I then need to take the User_NEW.Id and insert it into the UserSetting table with the corresponding Setting1, Setting1Value going into the new tables columns SettingName and SettingValue.

If you can help me with a script that could achieve that I would be very appreciative!

cullimorer
  • 755
  • 1
  • 5
  • 23
  • Find my "granadacoder" answer at this question if you want to do it set-based : https://stackoverflow.com/questions/17662432/insert-identity-column-value-into-table-from-another-table – granadaCoder Mar 26 '19 at 14:06

4 Answers4

2

The best way to handle this is not using SCOPE_IDENTITY or some other approach that forces you to use RBAR for this. A slight tweak to your new table makes this a bit simpler.

CREATE TABLE [dbo].[User_NEW](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [FullName] [nvarchar](50) NULL,
    [Email] [nvarchar](50) NULL,
    UserID_OLD int not null
    CONSTRAINT [PK_User] PRIMARY KEY CLUSTERED ([Id] ASC));

Notice the new column UserID_OLD? You can insert all your users in a single statement and then start the normalization process by joining to this table.

insert User_NEW
(
    FullName
    , Email
    , UserID_OLD
)
select FullName
    , Email
    , Id
from User_OLD

insert UserSetting
(
    UserId
    , SettingName
    , SettingValue
)
select Setting1
    , Setting1Value
    , un.Id
from User_OLD u
join User_NEW un on un.UserID_OLD = u.Id

Then you just repeat this insert for all the attribute/value combinations. And drop the UserID_OLD column after the migration is completed.

However you need to realize that what you have is an entity attribute value design and there are LOTS of pitfalls with this type of thing. For one, you have now stored everything in a varchar so you have no way to validate data at the database level. You also have a performance timebomb here because everything will constantly have to be converted to the proper datatype. And you have to be careful with your conversions or you will get conversion errors. The EAV pattern seems really awesome but in practice it very often is problematic.

Sean Lange
  • 33,028
  • 3
  • 25
  • 40
  • This looks like a really cool way of doing it, thanks! Thanks for the info on EAV, I wasn't aware of the pattern name. I know it's not the most ideal scenario but it's time dependent project. Out of curiosity, what are the alternative pattern ideas? – cullimorer Mar 26 '19 at 14:50
  • It is an extremely powerful approach when done as a hybrid. Mostly relational data with a touch of EAV here and there. The biggest issue I see with your direction is you have lost the datatype from the denormalized table. – Sean Lange Mar 26 '19 at 15:33
  • Ah yep, this is actually only an extremely simplified version of what I'm doing. I made this just to get the answer to how I split out the data into the separate tables. It isn't an ideal solution that I've found through EAV but it's going to be the quickest to engineer and will see at least the benefit of not having to change the DB every time a new field is added. Thanks for your assistance. – cullimorer Mar 26 '19 at 16:08
  • Upvote for this "set based" answer. Other answer(ers), PLEASE STOP GIVING RBAR SOLUTIONS. – granadaCoder Mar 26 '19 at 17:24
1

You can get the newly inserted id by using the output clause

declare @OutputTbl table (ID INT)
declare @NewUserID int

insert into User_NEW (FullName, Email)
output inserted.Id into @OutputTbl(ID)
VALUES ('john doe', 'john@somewhere.com')

select @NewUserID = ID from @OutputTbl

now you can use @NewUserID as key for all your inserts into client tables

Beware of using SCOPE_IDENTITY()
It could give you another ID then you expect, this can happen when there is a trigger for example that inserts into another table.
Also, by using the output clause you can capture more fields then just the Id

See also this
@@IDENTITY, SCOPE_IDENTITY(), OUTPUT and other methods of retrieving last identity

GuidoG
  • 11,359
  • 6
  • 44
  • 79
  • What bugs with SCOPE_IDENTITY? I don't think there any known bugs with any of those features. But often people don't understand @@IDENTITY and use it incorrectly. – Sean Lange Mar 26 '19 at 13:56
  • @SeanLange The answer in the post I linked mentions bugs, that is why i wrote this. I agree its not a bug when you get the ID of a row inserted by a trigger in another table but its just how this function works. I will change my answer – GuidoG Mar 26 '19 at 13:58
  • Interesting. The bug mentioned is a link to MS Connect which no longer exists. Will have to investigate a bit on this. Never heard of it until today. :) Seems it is when parallelism comes into play. The output statement would be the way to go anyway. :) – Sean Lange Mar 26 '19 at 14:03
  • I'm glad you posted this. I have used SCOPE_IDENTITY many times in the past without issue but I was unaware of the potential issues. This is the first time that I have see the OUTPUT clause. – Hopper Mar 26 '19 at 14:23
  • @Hopper Where all here to learn, I learn new stuff here every week. Glad I could help you – GuidoG Mar 26 '19 at 14:32
  • @SeanLange `The output statement would be the way to go anyway` I am glad we agree on this :) – GuidoG Mar 26 '19 at 14:52
0

If you are looking for a process that will allow you to get the key from a newly inserted row so that you can use it in subsequent code, then this is what you are looking for - SCOPE_IDENTITY().

https://learn.microsoft.com/en-us/sql/t-sql/functions/scope-identity-transact-sql?view=sql-server-2017

Declare PassedKey Integer; 

INSERT INTO User_New values (...) 
SET PassedKey = SCOPE_IDENTITY()

INSERT INTO User_Setting values (PassedKey, ...)
Hopper
  • 146
  • 6
0
    DECLARE @FullName nvarchar(50), 
    @Email nvarchar(50),
    @Setting1 bit,
    @Setting1Value int,
    @Setting2 bit,
    @Setting2Value int,
    @Setting3 bit,
    @Setting3Value int 

    DECLARE @recentId INT

    DECLARE C CURSOR FOR 
        SELECT [FullName] , [Email] ,[Setting1], [Setting1Value], [Setting2], [Setting2Value], [Setting3], [Setting3Value]
        FROM @User_OLD

        OPEN C
    FETCH NEXT FROM C INTO @FullName, @Email ,@Setting1 ,@Setting1Value ,@Setting2 ,@Setting2Value ,@Setting3 ,@Setting3Value  

    WHILE @@FETCH_STATUS = 0 
    BEGIN

        INSERT INTO @User_NEW(FullName, Email)
        VALUES(@FullName, @Email);

        SELECT @recentId = SCOPE_IDENTITY()

        INSERT INTO @UserSetting(UserId, SettingName, SettingValue, CreatedOn)
        VALUES(@recentId, @Setting1 , @Setting1Value , GETDATE())

        INSERT INTO @UserSetting(UserId, SettingName, SettingValue, CreatedOn)
        VALUES(@recentId, @Setting2 , @Setting2Value , GETDATE())

        INSERT INTO @UserSetting(UserId, SettingName, SettingValue, CreatedOn)
        VALUES(@recentId, @Setting3 , @Setting3Value , GETDATE())

        FETCH NEXT FROM C INTO @FullName, @Email ,@Setting1 ,@Setting1Value ,@Setting2 ,@Setting2Value ,@Setting3 ,@Setting3Value 

    END

    CLOSE C
    DEALLOCATE C    
Hasan Mahmood
  • 978
  • 7
  • 10