3

Could you please help understand how to populate fact table with Surrogate keys from dimensions.

I have the following fact table and dimensions:

ClaimFacts

ContractDim_SK ClaimDim_SK AccountingDim_SK ClaimNbr ClaimAmount

ContractDim

ContractDim_SK (PK) ContractNbr(BK) ReportingPeriod(BK) Code Name

AccountingDim

TransactionNbr(BK) ReportingPeriod(PK) TransactionCode CurrencyCode (Should I add ContractNbr here ?? original table in OLTP has it)

ClaimDim

CalimsDim_Sk(PK) CalimNbr (BK) ReportingPeriod(BK) ClaimDesc ClaimName (Should I add ContractNbr here ?? original table in OLTP has it)

My logic to load data into fact table is the following :

  1. First I load data into dimensions (with Surrogate keys are created as identity columns)
  2. From transactional model (OLTP) the fact table will be filled with the measures (ClaimNbr And ClaimAmount)

  3. I don’t know how to populate fact table with SKs of Dimensions, how to know where to put the key I am pulling from dimensions to which row in fact table (which key belongs to this claimNBR ?) Should I add contract Nbr in all dimensions and join them together when loading keys to fact?

What’s the right approach to do this? Please help, Thank you

Rachel
  • 208
  • 1
  • 5
  • 18

1 Answers1

9

The way it usually works:

  1. In your dimensions, you will have "Natural Keys" (aka "Business Keys") - keys that come from external systems. For example, Contract Number. Then you create synthetic (surrogat) keys for the table.
  2. In your fact table, all keys initially must also be "Natural Keys". For example, Contract Number. Such keys must exist for each dimension that you want to connect to the fact table. Sometimes, a dimension might need several natural keys (collectively, they represent dimension table "Granularity" level). For example, Location might need State and City keys if modeled on State-City level.
  3. Join your dim table to the fact table on natural keys, and from the result omit natural key from fact and select surrogat key from dim. I usually do a left join (fact left join dim), to control records that don't match. I also join dims one by one (to better control what's happening).

Basic example (using T-SQL). Let's say you have the following 2 tables:

Table Source.Sales
(   Contract_BK, 
    Amount, 
    Quantity)

Table Dim.Contract
(   Contract_SK,
    Contract_BK,
    Contract Type)

To Swap keys:

SELECT
     c.Contract_SK
    ,s.Amount
    ,s.Quantity
INTO
    Fact.Sales
FROM
    Source.Sales s LEFT JOIN Dim.Contract c ON s.Contract_BK = c.Contract_BK
    
-- Test for missing keys
SELECT 
    * 
FROM 
    Fact.Sale 
WHERE 
    Contract_SK IS NULL
RADO
  • 7,733
  • 3
  • 19
  • 33
  • 1
    Good answer, but even when you've decomposed claim dimensions, you might end up with a dim that has the same number of rows as the fact, as per what I'd call the 'Claim Details' dimension (in this case name, description). Its not ideal, but it is 'off to the side' and not used for filtering often, so not that much of a problem. Kimball design tip 140 (easily googled) agrees with the idea of having a claim as an accumulating snapshot fact. I agree the claim number should be degenerate in the fact. The accounting dim does look wrong, could be a fact all of its own? – Rich Mar 03 '18 at 12:19
  • Thank you, @Rich. Regarding "Claim details" - I solve this problem by making description field a non-additive fact. This way, you can do some calcs on it, i.e, Count of Description, or I can write a measure that displays the description content. Such design increases size of the fact table, but it's a much lesser evil compared to 1:1 sized dimensions. If claim (order, invoice, etc) has lots of small details such as flags, I always park them into junk dimensions. – RADO Mar 03 '18 at 18:18
  • Thank you for your answer it helped a lot, Could you clarify to me please if I remove ClaimNbr from ClaimDimension how I would be able to join fatTable to ClaimDim if I don't have the natural Key (ClaimNbr) in ClaimDim? should I move them to a staging area join them there and once I move them to the final destination I just don't include ClaimNbr in the Dimension? – Rachel Mar 04 '18 at 16:50
  • @Rachel: I added an edit to my answer. Hope it clarifies it for you. – RADO Mar 06 '18 at 04:59
  • @rado What if the dimension table is scd2? There will be more than one combination of natural key? – Ravi Aug 16 '18 at 01:57
  • @RaviR - scd2 essentially means that your dim table grain is business keys + date, not just business keys. Typically, it's implemented as effective date range (start date, end date) in dim, and you must have a valid transaction date in your fact table. Then you do the left join as fact.BK = dim.BK and fact.Date between dim.Start Date and dim.End Date. You just need to make sure that your start/end date ranges are constructed correctly (no overlaps, no gaps), otherwise you'll get all kinds of problems. – RADO Aug 16 '18 at 06:59
  • @rado thanks!! If you don't mind, can you check this one out https://stackoverflow.com/questions/51874230/is-it-better-to-have-a-surrogate-key-or-nkeffective-time-in-dimension-tables-in – Ravi Aug 16 '18 at 09:53
  • @RaviR - you welcome. I have aswered your other question. – RADO Aug 16 '18 at 11:51