Data Vault is a methodology and architecture created to address the business of designing, implementing, and managing a data warehouse.
Questions tagged [data-vault]
73 questions
7 votes
2 answers
Data Warehouse modelling: Data Vault vs Persistent Staging Area
Consider the following two DWH architectures:
DWH with Raw Data Vault, layers:
Source systems
Staging area (truncated on every load, exact schema of source tables)
Raw Data Vault (modelled as Data Vault, contains record history, hubs/sats/links…

user3596100
6 votes
5 answers
Differences between Data Vault and Dimensional modeling?
When modeling a data warehouse, is there any reason we should favor Data Vault over Dimensional modelling? What are the major differences between these two?

jrara
5 votes
1 answer
How to handle deleted records (from source) in the Data Vault model?
We are building a Data Vault (2.0) model to capture Salesforce data. Like many other sources, the records in the source are soft deleted. While we are sourcing data to the Data Model, we do not want to filter out any data, and we also want to capture deleted records…

Aditya
5 votes
1 answer
Datavault: How to get hashes for foreign key relationships (populating link tables)
I've read the data vault book end to end, but I'm still trying to resolve one specific thing related to how you'd populate the link tables (how to get all hashes for that). From the blog of scalefree: massively parallel processing, it demonstrates…

radialmind
5 votes
1 answer
How to integrate NoSQL with Data Vault 2.0 Modeling? How to use hash keys to integrate NoSQL DB?
I would like to learn more about how to integrate NoSQL databases into an architecture centered on the relational model (built according to Data Vault 2.0 standards). Does anyone have an idea of where I could educate myself on the subject? This is…

user2058291
4 votes
1 answer
Data Vault 2 - Hash diff and recurring data changes
I am having issues retrieving the latest value in a satellite table when some data is changed back to a former value.
The database is Snowflake.
As per Data Vault 2.0, I am currently using the hash diff function to assess whether to insert a new…
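The scenario in this excerpt (a value changing back to an earlier one) can be illustrated with a minimal Python sketch. It assumes the satellite is loaded by comparing the incoming hash diff only against the most recent row for the same hub key, rather than against every historical row. The function names, the `||` delimiter, the use of MD5, and the row layout are all hypothetical choices for illustration, not something stated in the question.

```python
import hashlib


def hash_diff(attributes: dict) -> str:
    """Hash the descriptive attributes in a fixed (sorted) column order."""
    payload = "||".join(str(attributes[k]).strip() for k in sorted(attributes))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def load_satellite(history: list, hub_hk: str, load_ts: int, attributes: dict) -> bool:
    """Append a row only if its hash diff differs from the LATEST row for this key.

    Comparing against the latest row (not the whole history) means a value
    that reverts to a former state is still recorded as a new change.
    """
    rows = [r for r in history if r["hub_hk"] == hub_hk]
    latest = max(rows, key=lambda r: r["load_ts"], default=None)
    new_hd = hash_diff(attributes)
    if latest is not None and latest["hash_diff"] == new_hd:
        return False  # nothing changed since the last load
    history.append(
        {"hub_hk": hub_hk, "load_ts": load_ts, "hash_diff": new_hd, **attributes}
    )
    return True


sat = []
results = [
    load_satellite(sat, "hk1", 1, {"status": "A"}),  # first value
    load_satellite(sat, "hk1", 2, {"status": "B"}),  # changed
    load_satellite(sat, "hk1", 3, {"status": "A"}),  # back to the former value
    load_satellite(sat, "hk1", 4, {"status": "A"}),  # unchanged
]
```

With this approach the third load is kept even though its hash diff already exists in the history, because only the latest row is used for the comparison.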

Roberto B.
4 votes
1 answer
Is Sales Transaction modeled as Hub or a Link in Data Vault 2.0
I'm a rookie in Data Vault, so please excuse my ignorance. I am currently ramping up and modeling a Raw Data Vault in parallel using Data Vault 2.0. I have a few assumptions and need help validating them.
1) Individual Hubs are modeled for:
a)…

Ragi
3 votes
2 answers
Data Vault Modelling
Assuming the following data architecture:
Source Systems -> Data Warehouse (using the data vault model) -> Data Virtualization -> Consumption Layer (e.g., BI Tools & reporting)
I read that for data vault, one of the key principles is to load raw…

SQLUser
3 votes
1 answer
How to handle data vault hubs with no business key?
We have a project for loading data from an external source into a Data Vault data warehouse. The data are salary statements between an employer and an employee.
When we started modelling this, we found two business keys: the company id of the…

Peter Å
3 votes
2 answers
Datavault - hard rules (rawvault) vs soft rules (businessvault)
I have a question on hard rules (rawvault) and soft rules (businessvault).
The example I have is a source system has a denormalized table called Pets where Pets contain Cats, Dogs, and Birds where they are distinguished by a Type Code (1 – cat, 2 –…

peterlandis
3 votes
2 answers
Data Vault 2.0 in SQL Server
In data vault 2.0 one hashes the business key and takes this hash as a primary key of the table.
Also Link Tables use the hash primary key to create a relationship.
My problem is that, with hashes that are basically random, the query optimizer cannot…
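As a rough illustration of the hashing described in this question, the following Python sketch derives a hub hash key from a business key and a link hash key from the combination of two business keys. The `hash_key` helper, the MD5 algorithm, the `||` delimiter, and the trim/uppercase normalization are assumptions chosen for the example; the question does not specify them.

```python
import hashlib


def hash_key(*business_keys: str, delimiter: str = "||") -> str:
    """Return a hex digest for one or more business keys.

    Keys are trimmed and upper-cased first so that the same business key
    always produces the same hash regardless of source formatting.
    """
    normalized = [str(k).strip().upper() for k in business_keys]
    return hashlib.md5(delimiter.join(normalized).encode("utf-8")).hexdigest()


# Hub keys: one hash per business key (example values are made up).
customer_hk = hash_key("C-1001")
product_hk = hash_key("P-77")

# Link key: a hash over the combined business keys of the related hubs.
link_hk = hash_key("C-1001", "P-77")
```

Because the link key is computed from the business keys themselves, it can be derived without first looking up the hub rows.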

tuxmania
2 votes
1 answer
If a payment has a 1-1 mapping with a transfer and each transfer has a 1-many mapping with attempts, how should I model my data vault?
Trying to model a process on process in DV 2.0.
If a payment has a 1-1 mapping with a transfer and each transfer has a 1-many mapping with attempts, how should I model my data vault?
Unable to understand how to model these scenarios of one…

Programmeur
2 votes
1 answer
Will replacing MERGE statements over several tables in a data vault model with a conditional INSERT ALL INTO reduce ingest time?
I am loading data on a daily basis into a data vault model on a Snowflake data warehouse.
I have split the ingest script (a JavaScript procedure) into 3 main parts for logging purposes.
Getting data into temporary table
Metadata part, where I add data…

alim1990
2 votes
2 answers
Snowflake: how can we loop over each row of a temp table and insert its values into another table, where each field with its value becomes a single row?
We are loading data into a fact table; the original temporary table on Snowflake looks like the following:
Where indicator_nbr fields are questions asked within a survey.
We are using data modelling techniques in building our warehouse database,…
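The reshaping asked about here (each field of a wide row becoming its own row) is usually called an unpivot. A minimal Python sketch of the idea follows; the column names `survey_id`, `indicator_1`, and `indicator_2`, and the `unpivot` helper itself, are hypothetical, since the original table layout is not shown in the excerpt.

```python
def unpivot(rows: list, id_column: str, value_prefix: str = "indicator_") -> list:
    """Turn each wide row into one output row per matching column."""
    out = []
    for row in rows:
        for column, value in row.items():
            # Only columns with the given prefix are treated as indicators.
            if column.startswith(value_prefix):
                out.append(
                    {id_column: row[id_column], "indicator": column, "value": value}
                )
    return out


# Made-up sample data in a wide layout: one row per survey response.
wide = [
    {"survey_id": 1, "indicator_1": "yes", "indicator_2": "no"},
    {"survey_id": 2, "indicator_1": "no", "indicator_2": "yes"},
]

long_rows = unpivot(wide, "survey_id")
```

In a database this would normally be done set-based rather than with a row-by-row loop; the sketch only shows the shape of the transformation.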

alim1990
2 votes
0 answers
Historical Load SCD2 In Kimball model involving multiple source tables
I am keen to find an efficient design solution that Ralph Kimball's model proposes for handling the historical load of an SCD Type 2 dimension involving multiple source tables, without using a PIT table.
The source data comprises many CDC-enabled tables,…

DataGuy