I have a Stored Procedure that extracts data from a ERP into a datawarehouse (DW). The problem that I facing is that the people from the DW asked me to have an unique ID for each row which I don't have.
An Example
CREATE TABLE [dbo].[INVOICES](
[InvID] [nchar](10) NULL,
[LineID] [nchar](10) NULL,
[Amount] [decimal](32, 4) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[DISTRIBUTION](
[InvID] [nchar](10) NULL,
[LineID] [nchar](10) NULL,
[Percent] [decimal](3, 2) NULL
) ON [PRIMARY]
InvID LineID Amount
---------- ---------- ---------------------------------------
FC0000001 1 6788.0000
FC0000001 2 8908.0000
InvID LineID Percent
---------- ---------- ----------------------
FC0000001 1 10
FC0000001 1 90
FC0000001 2 100
So, you can query:
SELECT I.InvID + '-' + D.LineID AS ID, D.*
FROM [INVOICES] AS I
LEFT JOIN [DISTRIBUTION] AS D
ON D.InvID = I.InvID
and D.LineID = I.LineID
And will get:
ID InvID LineID Percent
-------------------- ---------- ---------- ----------------------
FC0000001-1 FC0000001 1 10
FC0000001-1 FC0000001 1 90
FC0000001-2 FC0000001 2 100
But, as you see, If there is two records at [DISTRIBUTION] for the same LineID and InvID, you get two records from the query above and, the table [DISTRIBUTION] doesn't contain an ID for each line.
The Problem
So, I need to create an ID on the fly that is:
- Unique (each row must be different)
- Repeteable (every time I run the process, the same records must have the same ID value)
- I can't have an ID at the ERP level. I can't just add and ID to the [DISTRIBUTION] table because it's a third party, poorly designed ERP.
What I've already tried
- Binary Check sums are not safe and, if data change the ID changes.
- ROW_NUMBER() is what i'm using but... I'm not confortable with it. Affects performance and, there is a chance that it changes.
- Quiting my job is not an option :)