0

Currently, i am in the process of designing a schema for storing product information which in the end will be used for analyzing the price changes, availability of the product itself. The granularity of the analysis can varied, whether its on item level which is the lowest or article level which is a step above.

According to Kimball, i need to create a fact table based on the business processes over the granularity of it, so this is why currently i have 3 fact tables on hand. Other than the granularity that is different for each processes, the frequency of change is also can be different from one to another.

High Level of star Schema

Is this the correct approach to do so? If yes, how do I connect the 3 fact table to if there is a need to make an analysis using 2 fact table for example. So cross table in this case. Feel free to ask anymore question if something is not very clear.

  • 1
    Hi - I’m not sure what those link tables are for? In a Kimball model you join Dimensions via your Fact tables. If you want to pull information from multiple fact tables then you need to query them (and their associated dimensions) separately and combine the resultsets using the common dimensions – NickW Nov 26 '22 at 10:11

1 Answers1

-1

The approach you have described, based on Kimball's dimensional modeling principles, is generally a good way to design a schema for storing and analyzing product information. Having separate fact tables for different business processes and different levels of granularity is a common practice in dimensional modeling.

To connect the three fact tables and perform cross-table analysis, you can use shared dimensions and conformed dimensions. Shared dimensions are dimensions that are common to multiple fact tables, while conformed dimensions are dimensions that have the same meaning and structure across different fact tables.

In your case, you can identify the dimensions that are shared across the three fact tables, such as Product, Time, and Location. Create separate dimension tables for each of these shared dimensions and ensure they have the same structure and meaning across all three fact tables.

For example, you can create the following dimension tables:

  • DIM_PRODUCT: Contains information about products, such as product ID, product name, category, and other relevant attributes.
  • DIM_TIME: Contains information about time, such as date, month, quarter, and year.
  • DIM_LOCATION: Contains information about locations, such as location ID, city, state, country, and other relevant attributes.

Each fact table should have foreign keys to the corresponding dimensions. For example, the FACT_TABLE_PRICE_CHANGES can have foreign keys to DIM_PRODUCT and DIM_TIME, while FACT_TABLE_AVAILABILITY can have foreign keys to DIM_PRODUCT, DIM_TIME, and DIM_LOCATION.

To perform cross-table analysis, you can join the fact tables using the shared dimensions. For example, if you want to analyze the price changes and availability together, you can write a query that joins FACT_TABLE_PRICE_CHANGES and FACT_TABLE_AVAILABILITY on the shared dimensions (e.g., DIM_PRODUCT and DIM_TIME) to analyze the data across both tables.

Your source query for example can be like the following :

SELECT *
FROM FACT_TABLE_PRICE_CHANGES pc
JOIN FACT_TABLE_AVAILABILITY av ON pc.Product_ID = av.Product_ID
                               AND pc.Time_ID = av.Time_ID
WHERE pc.Date = '2023-06-21';

In this query, you are joining the two fact tables on the shared dimensions (Product_ID and Time_ID) to analyze the price changes and availability on the specified date.

Amira Bedhiafi
  • 8,088
  • 6
  • 24
  • 60