Questions tagged [data-vault]

Data Vault is a methodology and architecture created to address the business of designing, implementing, and managing a data warehouse.

73 questions
7
votes
2 answers

Data Warehouse modelling: Data Vault vs Persistent Staging Area

Consider the following two DWH architectures: DWH with Raw Data Vault, layers: Source systems; Staging area (truncated on every load, exact schema of source tables); Raw Data Vault (modelled as Data Vault, contains record history, hubs/sats/links…
user3596100
  • 147
  • 9
6
votes
5 answers

Differences between Data Vault and Dimensional modeling?

When modeling a data warehouse, is there any reason we should favor Data Vault over Dimensional modelling? What are the major differences between these two?
jrara
  • 16,239
  • 33
  • 89
  • 120
5
votes
1 answer

How to handle deleted records (from source) in the Data Vault model?

We are building a Data Vault (2.0) model to capture Salesforce data. Like many other sources, the records in the source are soft-deleted. While we are sourcing data to the Data Model, we do not want to filter any data and also want to capture deleted records…
Aditya
  • 2,299
  • 5
  • 32
  • 54
5
votes
1 answer

Datavault: How to get hashes for foreign key relationships (populating link tables)

I've read the data vault book end to end, but I'm still trying to resolve one specific thing related to how you'd populate the link tables (how to get all hashes for that). From the blog of scalefree: massively parallel processing, it demonstrates…
radialmind
  • 279
  • 2
  • 15
5
votes
1 answer

How to integrate NoSQL with Data Vault 2.0 Modeling? How to use hash keys to integrate NoSQL DB?

I would like to learn more about how to integrate NoSQL databases into an architecture centered on the relational model (built according to Data Vault 2.0 standards). Does anyone have an idea of where I could educate myself on the subject? This is…
user2058291
  • 107
  • 3
  • 10
4
votes
1 answer

Data Vault 2 - Hash diff and recurring data changes

I am having issues retrieving the latest value in a satellite table when some data is changed back to a former value. The database is Snowflake. As per Data Vault 2.0, I am currently using the hash diff function to assess whether to insert a new…
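
The hash diff check mentioned in this question can be illustrated with a minimal sketch. It assumes the incoming hash diff is compared only against the latest satellite row for the same hub key, so a value that changes back to a former value is still recorded as a new row. The names `hash_diff` and `load_satellite`, the row layout, and the use of SHA-256 are hypothetical choices for illustration, not taken from the question.

```python
import hashlib

def hash_diff(attributes: dict) -> str:
    """Hash all descriptive attributes in a fixed (sorted) column order."""
    payload = "||".join(str(attributes[k]) for k in sorted(attributes))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def load_satellite(satellite: list, hub_key: str, attributes: dict, load_ts: int) -> bool:
    """Append a row only if it differs from the LATEST row for this hub key.

    Returns True when a row was inserted.
    """
    new_diff = hash_diff(attributes)
    rows = [r for r in satellite if r["hub_key"] == hub_key]
    latest = max(rows, key=lambda r: r["load_ts"], default=None)
    if latest is not None and latest["hash_diff"] == new_diff:
        return False  # unchanged since the most recent load
    satellite.append(
        {"hub_key": hub_key, "load_ts": load_ts, "hash_diff": new_diff, **attributes}
    )
    return True

# Hypothetical satellite held in memory for the example.
sat = []
results = [
    load_satellite(sat, "HK1", {"status": "A"}, 1),  # first load
    load_satellite(sat, "HK1", {"status": "B"}, 2),  # changed
    load_satellite(sat, "HK1", {"status": "A"}, 3),  # changed back to a former value
    load_satellite(sat, "HK1", {"status": "A"}, 4),  # no change
]
```

Comparing against every historical hash diff instead of only the latest one would skip the third load, which is one way the "changed back" case can go missing.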
4
votes
1 answer

Is Sales Transaction modeled as Hub or a Link in Data Vault 2.0

I'm a rookie in Data Vault, so please excuse my ignorance. I am currently ramping up and modeling Raw Data Vault in parallel using Data Vault 2.0. I have a few assumptions and need help validating them. 1) Individual Hubs are modeled for: a)…
Ragi
  • 51
  • 1
  • 5
3
votes
2 answers

Data Vault Modelling

Assuming the following data architecture: Source Systems -> Data Warehouse (using the data vault model) -> Data Virtualization -> Consumption Layer (e.g., BI Tools & reporting) I read that for data vault, one of the key principles is to load raw…
SQLUser
  • 113
  • 8
3
votes
1 answer

How to handle data vault hubs with no business key?

We have a project for loading data from an external source into a Data Vault Data Warehouse. The data are salary statements between an employer and an employee. When we started modelling this, we found two business keys: the company id of the…
Peter Å
  • 1,269
  • 11
  • 20
3
votes
2 answers

Datavault - hard rules (rawvault) vs soft rules (businessvault)

I have a question on hard rules (rawvault) and soft rules (businessrules). The example I have is a source system with a denormalized table called Pets, where Pets contains Cats, Dogs, and Birds, distinguished by a Type Code (1 – cat, 2 –…
peterlandis
  • 645
  • 1
  • 7
  • 17
3
votes
2 answers

Data Vault 2.0 in SQL Server

In Data Vault 2.0 one hashes the business key and takes this hash as the primary key of the table. Link tables also use the hash primary key to create a relationship. My problem is that, with hashes that are basically random, the query optimizer cannot…
tuxmania
  • 906
  • 2
  • 9
  • 28
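
The hashing of business keys described in the "Data Vault 2.0 in SQL Server" question can be sketched as follows. This is a minimal illustration under stated assumptions: MD5 as the hash function, upper-casing and trimming as normalization, and `||` as the delimiter are common conventions but are not confirmed by the question, and `hash_key` and the sample keys are hypothetical.

```python
import hashlib

def hash_key(*business_keys: str, delimiter: str = "||") -> str:
    """Return an upper-case MD5 hex digest over normalized business key parts."""
    normalized = delimiter.join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest().upper()

# Hub hash keys: one business key each (sample values are made up).
customer_hk = hash_key("C-1001")
order_hk = hash_key("SO-42")

# Link hash key: derived from the business keys of all participating hubs,
# so it can be computed without looking up the hub tables first.
customer_order_hk = hash_key("C-1001", "SO-42")
```

Because the link key is computed from the same business keys as the hub keys, hubs and links can be loaded independently of each other.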
2
votes
1 answer

If a payment has a 1-1 mapping with a transfer and each transfer has a 1-many mapping with attempts, how should I model my data vault?

Trying to model a process on process in DV 2.0. If a payment has a 1-1 mapping with a transfer and each transfer has a 1-many mapping with attempts, how should I model my data vault? Unable to understand how to model these scenarios of one…
Programmeur
  • 190
  • 1
  • 14
2
votes
1 answer

Does replacing MERGE statements over several tables in a data vault model with a conditional INSERT ALL INTO reduce ingest time?

I am loading data on a daily basis into a data vault model on the Snowflake data warehouse. I have split the ingest script (a JavaScript procedure) into 3 main parts for logging purposes: getting data into a temporary table; a metadata part, where I add data…
alim1990
  • 4,656
  • 12
  • 67
  • 130
2
votes
2 answers

Snowflake: how can we loop over each row of a temp table and insert its values into another table, where each field with its value is a single row?

We are loading data into a fact table. The original temporary table on Snowflake looks like the following, where the indicator_nbr fields are questions asked within a survey. We are using data modelling techniques in building our warehouse database,…
alim1990
  • 4,656
  • 12
  • 67
  • 130
2
votes
0 answers

Historical Load SCD2 In Kimball model involving multiple source tables

I am keen to find an efficient design solution that Ralph Kimball's model proposes to handle the historical load of an SCD Type 2 dimension involving multiple source tables, without using a PIT table. The source data comprises many CDC-enabled tables,…