Questions tagged [dimensional-modeling]

Dimensional modeling (DM) is the name of a set of techniques and concepts used in data warehouse design. According to Ralph Kimball, its goal is to deliver data that’s understandable to the business users and fast query performance.

Dimensional modeling is widely accepted as the preferred technique for presenting analytic data because it addresses two simultaneous requirements:

  • Deliver data that’s understandable to the business users.
  • Deliver fast query performance.

It is considered to be different from entity-relationship modeling (ER). Dimensional Modeling does not necessarily involve a relational database. The same modeling approach, at the logical level, can be used for any physical form, such as multidimensional database or even flat files. According to data warehousing consultant Ralph Kimball, DM is a design technique for databases intended to support end-user queries in a data warehouse. It is oriented around understandability and performance. According to him, although transaction-oriented ER is very useful for the transaction capture, it should be avoided for end-user delivery.

Dimensional modeling always uses the concepts of facts (measures), and dimensions (context). Facts are typically (but not always) numeric values that can be aggregated, and dimensions are groups of hierarchies and descriptors that define the facts.

Fundamental concepts for data-modeling

  • Gather Business Requirements and Data Realities - before launching a dimensional modeling effort, the team needs to understand the business and the realities of the underlying source data. Understanding the objectives based on key performance indicators and decision-making processes.
  • Collaborative Dimensional Modeling Workshops - it should be designed with subject matter experts and data governance specialists Collaboration is critical.

Four-Step Dimensional Design Process

  1. Select the business process - the operational activities performed by an organization.
  2. Declare the grain - exactly what a single fact table row represents.
  3. Identify the dimensions - “who, what, where, when, why, and how” context surrounding a business process event
  4. Identify the facts - the measurements the result from a business process event

Example

Sales amount is a fact; timestamp, product, register#, store#, etc. are elements of dimensions. Dimensional models are built by business process area, e.g. store sales, inventory, claims, etc. Because the different business process areas share some but not all dimensions, efficiency in design, operation, and consistency, is achieved using conformed dimensions, i.e. using one copy of the shared dimension across subject areas.

Links

445 questions
63
votes
8 answers

Star-Schema Design

Is a Star-Schema design essential to a data warehouse? Or can you do data warehousing with another design pattern?
S.Lott
  • 384,516
  • 81
  • 508
  • 779
12
votes
1 answer

Generating star schema in hive

I am from SQL Datawarehouse world where from a flat feed I generate dimension and fact tables. In general data warehouse projects we divide feed into fact and dimension. Ex: I am completely new to Hadoop and I came to know that I can build data…
Zerotoinfinity
  • 6,290
  • 32
  • 130
  • 206
10
votes
2 answers

joining across multiple fact tables with a dimension in between

What's a good approach to data warehouse design if requested reports require summarized information about the same dimensions (and at the same granularity) but the underlying data is stored in separate fact tables? For example, a report showing…
8
votes
3 answers

What's the Grain in the context of DW

According to The Data Warehouse Toolkit by Kimball "The grain must be declared before choosing dimensions or facts because every candidate dimension or fact must be consistent with the grain." I'm so confused about this concept .Could some one…
Anyname Donotcare
  • 11,113
  • 66
  • 219
  • 392
8
votes
2 answers

Dimensional Modelling - ambiguous relations

I've been trying to solve an issue, and to date I haven't been able to reach what I'd say is an optimal solution. I have a dimension (Features) which needs to be referenced in 2 other dimensions (Actions and Sessions), which in turn are referenced…
Zepee
  • 1,640
  • 3
  • 20
  • 40
7
votes
4 answers

Why NULL values are mapped as 0 in Fact tables?

What is the reason that in measure fields in fact tables (dimensionally modeled data warehouses) NULL values are usually mapped as 0?
jrara
  • 16,239
  • 33
  • 89
  • 120
7
votes
2 answers

Redshift Performance of Flat Tables Vs Dimension and Facts

I am trying to create dimensional model on a flat OLTP tables (not in 3NF). There are people who are thinking dimensional model table is not required because most of the data for the report present single table. But that table contains more than…
prakash
  • 165
  • 1
  • 2
  • 10
7
votes
0 answers

Font error when clicking Attribute Relationships tab in Dimension Designer in SSDT

I am new to SSDT and I'm following the Analysis Services Tutorial called Multidimensional Modeling (Adventure Works Tutorial). And I'm currently on Lesson 3, on the step Modifying the Customer Dimensions under the heading "Modifying Attribute…
7
votes
1 answer

audit table vs. Type 2 Slowly Changing Dimension

In SQL Server 2008+, we'd like to enable tracking of historical changes to a "Customers" table in an operational database. It's a new table and our app controls all writing to the database, so we don't need evil hacks like triggers. Instead we will…
Justin Grant
  • 44,807
  • 15
  • 124
  • 208
7
votes
4 answers

Handling multiple fact tables in Qlikview

I have a PostgreSQL database containing various education data such school-level test scores and enrollment figures. I need to separate enrollment from test scores because the data is on different grains. Even though enrollment is on a different…
6
votes
1 answer

Dimension for geozones or Lat & Long in data warehouse

I have a DimPlace dimension that has the name of the place (manually entered by the user) and the latitude and longitude of the place (automatically captured). Since the Places are entered manually the same place could be in there multiple time…
tember
  • 1,418
  • 13
  • 32
6
votes
1 answer

How to design a shrunken dimension table for dates in dimensional warehouse and use in SSAS?

I'm working in a situation where we're moving from having a bunch of transactional fact tables to a more complicated picture with aggregates, snapshots, etc. In the past, there were a few cases where data needed to be aggregated by month, but…
6
votes
1 answer

Naming standards for dimensional modeling

I am working on my first dimensional modeling assignment for a Data Warehouse project using Kimball's approach. As I prepare my model and think about physical objects, I wonder what is the recommended naming scheme for database objects. We're going…
vikc
  • 65
  • 1
  • 7
6
votes
1 answer

Dimension row with multiple attributes

This isn't exactly what I am doing but I feel this is a good example: Let's say I have a Product dimension table that connects to my ProductSales Fact table. Each row in dimProduct holds all the relevant data for a single product (code, name,…
5
votes
1 answer

What is the difference between fact-less fact and fact table?

What is the exact difference between fact-less fact and fact table ? I have perused couple of articles but they weren't cogent
1
2 3
29 30