4

I'm reading Ralph Kimball's book on Data warehouse and Dimension Modeling. I am reading one of the case studies, and it is about dimension modeling for an order system, where the requirement is to capture an order lifecycle, from order to fulfillment to shipped.

So, I was thinking that maybe they would suggest to have multiple lines with a transaction type FK to a transaction dimension. However, the book suggests instead to create 'role-playing' dimensions - create multiple date dimensions tables (one for order date, one for fulfillment, and one for shipped). Each one of them would then have a foreign key into the fact table, and therefore the fact table would have three columns to relate this.

Isn't this kind of restricting? Wouldn't a line-per-transaction be a better choice?

Nickolay
  • 31,095
  • 13
  • 107
  • 185
M.R.
  • 4,737
  • 3
  • 37
  • 81
  • Read again, this is not correct. There is only one date dimension table. – Damir Sudarevic Apr 25 '11 at 11:51
  • I've read it again - its one date dimension, but its multiple fields in the fact table. The other date dimensions are what is known as 'role playing' dimensions, but either ways, it lead to effect that there are multiple date dimensions.. – M.R. Apr 25 '11 at 15:02
  • It is called "accumulating snapshot" as in this similar example http://stackoverflow.com/questions/3918604/fact-table-with-different-update-schedules/3919196#3919196 – Damir Sudarevic Apr 25 '11 at 15:54

1 Answers1

2

Design often involves trade offs, and it's hard to know what design is best without a lot of details on the entire system.

But my take on this: the table from the book with three separate columns, would likely speed up queries. Data warehouses are often denormalized like this to increase query performance, at the expense of simplicity and versatility of input.

Seems like a good answer to me: your line per transaction sounds better for the data capture tables that store the day to day transactional data, but not as great for analysis.

Jamie F
  • 23,189
  • 5
  • 61
  • 77
  • In a lifecycle we could have "Start" , "InProcess" and "End"..which can be tracked by 3 columns. However, if I needed to analyze the different "InProcess" statues or if I had more statues what is the direction I need to take? Any insights on this? – Franklin Nov 24 '12 at 08:38