2

Goal

I aim to create SSIS (ETL) Template that enables audit functionality (Audit Dimension). I've discovered a few ways to implement audit dimension that are described below with some reference links below:

SEQUENCE

Primary Key

Best way to get identity of inserted row?

Environment:

  • There are millions of rows in a fact tables and packages run a few times a day.
  • Incremental ETL gets thousands of rows.
  • SQL Server 2012 BI edition is used for the BI solution.

Simplified Schema of DimAudit table:

CREATE TABLE [dw].[DimAudit] (
       [AuditKey] [int] IDENTITY(1 ,1) NOT NULL,
       [ParentAuditKey] [int] NOT NULL,
       [TableName] [varchar] (50) NOT NULL DEFAULT ('Unknown'),
       [PackageName] [varchar] (50) NOT NULL DEFAULT ('Unknown'),
       [ExecStartDate] [datetime] NOT NULL DEFAULT ( getdate()),
       [ExecStopDate] [datetime] NULL,
       [SuccessfulProcessingInd] [char] (1) NOT NULL DEFAULT ('N'),
 CONSTRAINT [PK_dbo.DimAudit] PRIMARY KEY CLUSTERED
(
       [AuditKey] ASC
)
) ON [PRIMARY]

ALTER TABLE [dw].[DimAudit]  WITH CHECK ADD  CONSTRAINT [FK_DimAudit_ParentAuditKey] FOREIGN KEY( [ParentAuditKey])
REFERENCES [dw]. [DimAudit] ( [AuditKey])
GO

ALTER TABLE [dw].[DimAudit] CHECK CONSTRAINT [FK_DimAudit_ParentAuditKey]
GO

Primary Key Option:

Primary key is generated in the audit table and then AuditKey is queried.

enter image description here

Task: Master SQL Audit Generates Key (SQL Task)

INSERT INTO [dw].[DimAudit]
           (ParentAuditKey
           ,[TableName]
           ,[PackageName]
           ,[ExecStartDate]
           ,[ExecStopDate]
           ,[SuccessfulProcessingInd])
     VALUES
           (1
           ,'Master Extract Package'
           ,?
           ,?
           ,?
           ,'N')

SELECT AuditKey 
FROM [dw].[DimAudit] 
WHERE TableName = 'Master Extract Package' and ExecStartDT = ?  
/*
Last Parameter: ParameterSystem::StartTime
Result Set populates User::ParentAuditKey
*/

Task: Master SQL Audit End (SQL Task)

UPDATE  [dw]. [DimAudit]
SET    ParentAuditKey = AuditKey
       ,ExecStopDT = SYSDATETIME()
       ,SuccessfulProcessingInd= 'Y'
WHERE   AuditKey = ?

/*
Parameter: User::ParentAuditKey
*/

SEQUENCE Option:

The sequence option does not select primary key (AuditKey) but uses logic below to create next available AuditKey.

CREATE SEQUENCE dbo . AuditID as INT
        START WITH 1
        INCREMENT BY 1 ;
GO

DECLARE @AuditID INTEGER ;
SET @AuditID = NEXT VALUE FOR dbo. AuditID ;

Best way to get identity of inserted row?

It feels risky using identity options as ETL packages could be executed in parallel.

Question

What is the recommended practice for audit dimension table and managing keys?

Sequence & primary key options do the job; however, I have concerns about the selecting primary key option because package could be executed the same millisecond (in theory) and therefore, a few primary keys would exist. So, Sequence sounds like the best option.

Is anything better I could do to create Audit Dimension for a data mart?

Community
  • 1
  • 1
BI Dude
  • 1,842
  • 5
  • 37
  • 67

1 Answers1

0

You could use the OUTPUT syntax:

INSERT INTO [dw].[DimAudit]
       (ParentAuditKey
       ,[TableName]
       ,[PackageName]
       ,[ExecStartDate]
       ,[ExecStopDate]
       ,[SuccessfulProcessingInd])
OUTPUT inserted.AuditKey
VALUES
       (1
       ,'Master Extract Package'
       ,?
       ,?
       ,?
       ,'N')

or SCOPE_IDENTITY() which is what I'm personally using:

INSERT INTO Meta.AuditDim (
      Date,
      UserName,
      Source,
      SourceType,
      AuditType,
      ExecutionId,
      ExecutionHost,
      ParentAuditKey,
      FileID
    )
VALUES (
      GETDATE(),
      CURRENT_USER,
      @Source,
      @SourceType,
      @AuditType,
      @ExecutionId,
      @ExecutionHost,
      @ParentAuditKey,
      @FileID
    );

SELECT AuditKey FROM Meta.AuditDim WHERE AuditKey = SCOPE_IDENTITY();