1

I am attempting to create an archive table for a table that I've built in SQL Server. Using an SSIS package I'd like to record each change to the table and add this to the archive table. If successful I should be able to both query the archive and build the table that existed at a specific date range, and be able to query the table to see a list of changes in a date range. In the same SSIS package that updates my report table from the data sources I am using, I have a derived column task which will insert a BOOLEAN value, a transaction date, and I'd like to add a unique integer id for each transaction. There is not a clear way to do this and I'm wondering if someone could inform me?

For reference my model for this project is:

Source DATA --> SSIS --> Report Table --> Archival Table

I cannot use CDC as that is not suported in the standard version of SQL server. Here is what my derived column task looks like now. I am wondering what kind of expression I could use to get a unique primary key value as the Tx_ID

enter image description here

Hadi
  • 36,233
  • 13
  • 65
  • 124
Calvin Ellington
  • 713
  • 3
  • 11
  • 21
  • Why not just add an IDENTITY column to the archive table? – Tab Alleman Feb 12 '18 at 21:13
  • All tables in this workflow are initially created the first time the SSIS package runs. I'm trying to set it up so that there is no manual data entry or table creation required so that users who come after me will simply have to run the package to create all the necessary tables. Everything works outside of this archival functionality, I figured it would be easier to add the identity column through my derived column task. I'm open to alternative solutions if there is an easier way. – Calvin Ellington Feb 12 '18 at 21:31
  • 1
    You are creating a temporal archive. When making data temporal you only need to add 2 columns, although 3 can make it simpler. You will need a new primary key column. For this I suggest an auto-auto incrementing identity. You will need a creation date for each record. You can use the lead function on the original primary key ordered by creation date to determine the inactive timestamp for each record, although creating this through an ETL process will simplify querying this table. Additionally, look into re-mapping relationships using new PKs, as the old relationships become 1 to many. – Andrew O'Brien Feb 13 '18 at 01:34

1 Answers1

1

1st Method - Add an Identity column

This is the most simple and recommended method, if there is no need to use the identity value within the package, just add an Identity column to the destination table

ALTER TABLE [Archival Table]
ADD Tx_ID INT IDENTITY(1,1)

2nd method - using a Script Component

  1. Before DataFlow Task add an Execute SQL Task that return the MAX(ID) from this table

    SELECT MAX(Tx_ID) FROM [Archival Table]
    
  2. Store the result in a Variable (ex @[User::MaxID]) using a Single Row ResultSet

  3. In the DataFlow Task Add a Script Component, Mark @[User::MaxID] as ReadOnly Variable
  4. Add an Output Column of type DT_I4 (ex: NewTxID)
  5. In the Script Editor use the following Code (i used Visual Basic language)

    Imports System  
    Imports System.Data  
    Imports System.Math  
    Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper  
    Imports Microsoft.SqlServer.Dts.Runtime.Wrapper  
    
    <Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _  
    <CLSCompliant(False)> _  
    Public Class ScriptMain  
        Inherits UserComponent 
    
        Private CurrentID as Integer  = 0
    
        Public Overrides Sub PreExecute()  
            MyBase.PreExecute()  
    
            CurrentID = Me.Variables.MaxID
    
        End Sub  
    
        Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)  
    
            CurrentID += 1
    
            Row.NewTxID = CurrentID
    
    
        End Sub 
    
    End Class
    
  6. In the OLEDB Destination Map the NewTxID column to the destination identity column

References

Hadi
  • 36,233
  • 13
  • 65
  • 124
  • Thank you, the first method looks like what I will have to do. If I want to automate the process I may have to write some code that creates the table and kicks of the SSIS package. – Calvin Ellington Feb 13 '18 at 18:31