8

According to The Data Warehouse Toolkit by Kimball

"The grain must be declared before choosing dimensions
or facts because every candidate dimension or fact must be consistent with the grain."

I'm so confused about this concept .Could some one illustrate what's the meaning of Grain giving an analogy or metaphor in real life to clarify the concept .

Anyname Donotcare
  • 11,113
  • 66
  • 219
  • 392

3 Answers3

9

Some examples:

  • "The sales table has a grain of DAY, STORE, PRODUCT"
  • "The sales table has a grain of DAY, STORE, PRODUCT, CUSTOMER"
  • "The sales table has a grain of HOUR, STORE, CUSTOMER"
  • "The sales summary table has a grain of DAY, STORE, PRODUCT_CLASS"

Hence the time dimension must support HOUR and DAY, and the product dimension must support PRODUCT and PRODUCT_CLASS.

David Aldridge
  • 51,479
  • 8
  • 68
  • 96
9

The grain (or granularity of the fact) refers to the 'level' at which you're taking a measurement. A fact table describes a measurement taken of a business process, so the best way to describe the grain is to describe what you get for each row. The classic example for a supermarket checkout is 'one row for every beep/scan'. This is better than saying 'one row for every day, product and store' (i.e. naming the dimensions) because it grounds it in reality.

The grain/level element is that you might be storing a row at a level of product, or you might be storing it at some grouping of products. This matters as it will determine whether you can use the product-level dimension or the group-level dimension with it.

Rich
  • 2,207
  • 1
  • 23
  • 27
2

To add to David's examples. What if your Date dimension lowest grain is a week and Sales grain is a day? That means your DIM and FACT grain is inconsistent. Also, your sales table could point out a product and it's color, but what if the lowest grain of your Product dimension is just the product and no colors? Again there is an inconstancy in your DIM and FACT lowest grain.

Paul Kar.
  • 1,293
  • 2
  • 21
  • 32