
I am building a data warehouse from scratch. I work alone at the company, so I will implement everything myself. I have no experience with Data Vault. Some sources online claim that building a Data Vault is not recommended for low-resource teams like mine.

The question is whether to implement Data Vault or just go with a traditional approach (Kimball). I know the two can be combined, but my doubt is whether to implement Data Vault at all.

I would like to know whether the claim that Data Vault is not appropriate for small teams is justified.

Thank you in advance

I have read tons of articles and attended webinars. Most did not mention that Data Vault is inappropriate for small teams, probably because they assume the team is larger than one person.

greenglas
  • This is an opinion-based question. My opinion is that Data Vault is only suited to enormous enterprise customers. Even after you've done the enormous amount of data engineering required to build it, it needs ongoing maintenance, because any bugs are magnified by the solution. If you want analytics, do not build a data vault. After you build it, you still have to string together all of the hubs, satellites, and links to make a useful star schema. And now you have another transformation layer that is a barrier to fast implementation – Nick.Mc Oct 31 '22 at 10:44
  • Carefully consider what use case you have for a data vault. It's a source-system data capture layer, but it does very little to support actual analytics. If you need analytics, then the proven approach is Kimball. You just need to decide whether you have a use case for bolting a big, complicated data vault onto the front of it. – Nick.Mc Oct 31 '22 at 10:47
  • Reading through the article you linked: do you have any critical need for the use cases cited? Note that in the diagram you have the "raw vault", which is the thing generated by metadata. This gulps huge amounts of data. Then you have the "business vault", which is not very well explained, but it essentially tries to turn the raw vault into something vaguely usable by the business. Then you have "Information Marts", which are basically the star schemas we have been building for decades, except that now it takes longer to get data into them because you have these redundant layers at the start – Nick.Mc Oct 31 '22 at 10:50
  • In short, I am not a fan of over-engineering when there is no requirement. Maybe have a read of this, and try to find some examples of the kind of queries you need to write against a raw vault: https://timi.eu/blog/data-vaulting-from-a-bad-idea-to-inefficient-implementation/ – Nick.Mc Oct 31 '22 at 10:51
  • Opinion warning. It's not a question of team size but more of project dynamics. I had been trying to implement Kimball's Enterprise Service Bus in a company where the source files and their implementation were constantly changing. Switching to DV was a game changer for me and my team of 6 people. Putting effort into loading data "as is", without applying business rules or interpretations for missing data, made it a lot easier to reload when stakeholders changed their minds. This ignorance of business rules is the best part of the model to me, and the model itself supports it pretty well. – najczuk Nov 03 '22 at 00:47
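
To make the comments about "stringing together" hubs and satellites a bit more concrete, here is a minimal sketch of what such a query might look like. It uses Python's built-in `sqlite3` module with an in-memory database. All table and column names (`hub_customer`, `sat_customer`, `customer_hk`, `load_date`, and so on) are hypothetical and chosen only for illustration; a real raw vault would have many more tables, plus links between hubs.

```python
import sqlite3

# Hypothetical raw-vault tables; every name and value here is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (customer_hk TEXT PRIMARY KEY, customer_id TEXT);
CREATE TABLE sat_customer (customer_hk TEXT, load_date TEXT, name TEXT, city TEXT);
INSERT INTO hub_customer VALUES ('hk1', 'C-001'), ('hk2', 'C-002');
INSERT INTO sat_customer VALUES
  ('hk1', '2022-01-01', 'Ann', 'Oslo'),
  ('hk1', '2022-06-01', 'Ann', 'Bergen'),
  ('hk2', '2022-03-01', 'Bob', 'Paris');
""")

# Current-state customer dimension: join the hub (business key) to the
# most recent satellite row (descriptive attributes) for each hash key.
DIM_CUSTOMER_SQL = """
SELECT h.customer_id, s.name, s.city
FROM hub_customer AS h
JOIN sat_customer AS s ON s.customer_hk = h.customer_hk
WHERE s.load_date = (
    SELECT MAX(s2.load_date)
    FROM sat_customer AS s2
    WHERE s2.customer_hk = s.customer_hk
)
ORDER BY h.customer_id
"""

dim_customer = conn.execute(DIM_CUSTOMER_SQL).fetchall()
print(dim_customer)
```

Even this one-hub, one-satellite case needs a join plus a "latest row" filter before it looks like a dimension table. Whether that extra layer is a burden or a benefit is exactly what the comments above disagree on.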
