A datawarehouse without measures

Question

Am I doing this correctly? There's no measure so this is throwing me off a bit.

I am designing my database to hold records of user profiles. The Users can come in and edit profile on a front end portal that links to the this DB when records are edited/updated/deleted. The DB also needs to produce XML feeds for a public website.

The warehouse:

@JohnHC Not necessarily for a datawarehouse. I think OP might be thinking of the Data Vault paradigm. (-edit- Dimensional Modelling rather; sorry) — , Nov 02 '16 at 15:26
Depends what you are creating this datawarehouse for. Where do you want to report on? — Jens, Nov 02 '16 at 15:29
I am designing my database to hold records of user profiles. The Users can come in and edit profile on a front end portal that links to the this DB when records are edited/updated/deleted. The DB also needs to produce XML feeds for a public website. — Lauren Buxton, Nov 02 '16 at 15:56
@LaurenBuxton Please add the further information you've given in comments into the original question. — Jo Douglass, Nov 02 '16 at 16:42

score 1 · Answer 1 · answered Nov 02 '16 at 15:26

1

Yes, a fact table can exist without measures, it is called a factless fact table.

Please inform more on : http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/factless-fact-table/ and other documentation.

answered Nov 02 '16 at 15:26

Radu Maftei

81
9

What would I set as PK in the FACT please, PersonalID? The rest would be FKs? – Lauren Buxton Nov 02 '16 at 16:16

score 0 · Accepted Answer · edited May 23 '17 at 11:48

0

While you absolutely can have a fact table without measures - as RaduM has linked to an explanation of - if you have no measures anywhere in your model I would question whether this database should use a dimensional model at all.

Dimensional models are intended for BI functions - data analysis, reporting, feeding into cubes, etc. Your description in a later comment about the use of this database seems to suggest this database is actually just the back end database for a website? If so, I would suggest avoiding dimensional modelling altogether. A standard normalised data model is likely to be far more suitable.

Data warehouses are normally secondary datastores which are not your live application database. Data is pulled from your primary sources into the data warehouse for reporting and analytics needs.

Transactional databases - like the one you are describing - are generally modelled in a more standard and more highly normalised manner. The usual gold standard is third normal form or higher. If you're unclear on the rules of database normalisation and the concept of third normal form, then I would strongly suggest that you obtain some training on this (there are online tutorials around if you search), and then have a crack at remodelling your scenario in this way. If you get stuck, post up a new question with the problem(s) you're running into.

You might also find this previous question helpful - it describes the difference between OLTP and OLAP. While you're not using OLAP, dimensional models are often used as the the RDBMS layer behind an OLAP database:

What are OLTP and OLAP. What is the difference between them?

edited May 23 '17 at 11:48

Community

1
1

answered Nov 02 '16 at 16:47

Jo Douglass

2,055
1
19
30

The profiles that will be displayed on a web front end where users are able to edit and update certain fields. Some fields are being brought through from the data warehouse that are fixed, the user can not edit, these fields come from primary source DBs like HR DBs and other DBs. The profiles will link to the DW and update/delete/edit when users have submitted there profiles and they have been accepted by another user called a moderator. – Lauren Buxton Nov 03 '16 at 10:00
The reason I set it out in the star schema approach is because the personalID has a one to many relationship with DepJob and a one to many relationship with Research project, similarly ResearchArea and StudentSupervisor. If I were to put in one normalised table I would be repeating the Personal information 20/30 times over for each personalID. After explaining this, is it still best practice to go for a normalised table, if so please could you explain the benefit of this? Sorry this is the first time I have come up against something like this, this feed is really helping me so thank you! – Lauren Buxton Nov 03 '16 at 10:01
@LaurenBuxton A properly normalised database minimises repetition of data. If you find in a normalised model that you are storing duplicated data then it is not actually normalised. If one person can have more than one research project, in a normalised model you would simply put a foreign key from ResearchProject to Person. If the relationship is actually many-to-many, you just need to make a table called something like PersonResearchProject, which would hold a foreign key to Person, and a foreign key to ResearchProject. – Jo Douglass Nov 03 '16 at 10:16
Hi Jo, hope you don't mind me calling you that, thanks for coming back to me. The structure in the image above is normalised, there is no repetition, but I thought this is what you would called dimensional, getting a bit confused.. Would this be the best approach for a DB that is being used for a web app? What's the difference between dimension models and normalised? – Lauren Buxton Nov 03 '16 at 10:35
No worries, it is my name. :) I would say we're getting a little past the remit of this site - I'd suggest looking up some tutorials on database normalisation. For instance: http://www.studytonight.com/dbms/database-normalization.php. A database normalised to 3rd normal form or higher is the gold standard for a relational application database, and will be a very simple model for your case. I would suggest reading up, having a crack at it, and asking a new question if you get stuck (showing us what you have so far). – Jo Douglass Nov 03 '16 at 10:46
Continuing from the previous comment - so more highly normalised models are the standard for transactional systems (which is what you're describing). Dimensional models are *only* for reporting and analytics purposes. You simply should not use one unless that is the purpose of the database. It's a very specialised type of modeling for specific use cases. They are essentially meant for read-only situations where you want to simplify querying of wide data sets for data analysts, or you want to feed the data into a multidimensional analysis solution. – Jo Douglass Nov 03 '16 at 10:52
My model above is in 3NF, fantastic thanks for your help. – Lauren Buxton Nov 03 '16 at 12:06

A datawarehouse without measures

2 Answers2