I'm an analyst preparing Tableau reports with analysis for other teams. I would like to take some workload off my shoulders by creating a data source optimized well enough that users can pull the data they need and do the analysis themselves.
Current situation:
We use Amazon Redshift. We have tables with raw data coming directly from the source systems, plus some transformed tables for easier work. All in all, it's dozens of tables. We are using Tableau Desktop and Tableau Server.
Desired situation:
I would like to retain access to the raw data so I can trace any potential issue back to the original source. From the raw data, I would like to build transformed tables that users run their queries against (a two-layer system). The tables should contain all the data a user might need, yet be simple enough for a beginner-level SQL user.
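To make the two-layer idea concrete, here is a rough sketch of what I have in mind (all schema, table, and column names are made up for illustration):

```sql
-- Layer 1: raw data, loaded as-is from the source systems, e.g.
-- raw.orders(order_id, customer_id, order_ts, status_code, amount_cents)

-- Layer 2: a transformed table built from the raw layer,
-- with decoded values and friendly names so it is easy to query
CREATE TABLE analytics.orders AS
SELECT
    o.order_id,
    o.customer_id,
    o.order_ts::date       AS order_date,
    s.status_name          AS order_status,   -- decode the raw status code
    o.amount_cents / 100.0 AS amount_usd
FROM raw.orders o
JOIN raw.status_codes s ON s.status_code = o.status_code;
```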
I see two ways of approaching this:
- A small number of very large tables holding all the data. With just a couple of tables containing the maximum amount of data, users can query a single table and pick the columns they need, or, if necessary, join one or two more tables to it.
- Many small, specialized tables. Users will have to do multiple joins to get the data they need, but every table will be very simple, so the joins should not be difficult (a sketch contrasting the two approaches follows this list).
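For illustration, here is how the same question might look under each approach (again, all names are hypothetical):

```sql
-- Approach 1: one wide, denormalized table -- a beginner only needs SELECT
SELECT order_date, customer_name, product_name, amount_usd
FROM analytics.orders_wide
WHERE order_date >= '2023-01-01';

-- Approach 2: small specialized tables -- the same question needs joins
SELECT o.order_date, c.customer_name, p.product_name, o.amount_usd
FROM analytics.orders o
JOIN analytics.customers c ON c.customer_id = o.customer_id
JOIN analytics.products  p ON p.product_id  = o.product_id
WHERE o.order_date >= '2023-01-01';
```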
Also, access permissions to the data need to be considered.
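What I picture, roughly, is restricting the raw layer to the data team and opening only the transformed layer to users, along these lines (group and schema names are hypothetical):

```sql
-- Users can query the transformed layer only; no grants are issued
-- on the raw schema, so the raw data stays with the data team
CREATE GROUP analysts;
GRANT USAGE ON SCHEMA analytics TO GROUP analysts;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO GROUP analysts;
```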
What do you think is a good approach to solving my issue? Is it one of the two solutions mentioned above? Do you have another solution in mind? What would you recommend?