7

I want to design a web application for keeping track of the finance of the members of an organization. Certain functions are very similar to Splittr. I define my requirements using the MWE database diagrams:

  • "Finance" tables: Each user will have one personal finance account, for which I am using the following three red tables: enter image description here
  • "SharedExpense" tables: Each user can have shared expenses with other users in many 'shared-expense-groups'. For each group, I am using the following three blue tables: enter image description here

(Note how each user can define amount of their share, and own category of the shared expense. UserShare table uses a composite primary key.)

The problem: I have to relate the users to their 3 personal "Finance" tables and the 3N "SharedExpense" tables (where N is the number of 'shared-expense-groups' the user belongs to).

Attempted Solutions:

  1. Multiple databases. Each user has a unique database for their "Finance" tables. Each 'shared-expense-group' has a unique database on the server. I can then relate the users from one master database with the following four purple tables:
    Solution with multiple databases
    Drawbacks: Foreign keys come from different databases, large number of databases to be backed up.

  2. Multiple tables. I can create all the tables in the same database and relate all of them with the four green master tables: Solution with multiple tables
    Here, the number of tables is a potential problem. If there are M users and N 'shared-expense-groups, then there will be 3M + 3N tables!

The question: Is there more elegant and simpler database design for the purpose? If not, which of the above two solutions is better and why?

Links to relevant, previous StackOverflow Q&A:

banskt
  • 416
  • 3
  • 7
  • 17
  • 1
    depends on what your end goal is? Do you just plan on writing reports off of this design or will this be consumed by MDX/DAX models ; then your looking at another set of dimensionalized table designs. – junketsu Nov 29 '18 at 13:42
  • 1
    Which DBMS are you *really* using? SQL Server != MySQL. – Thom A Nov 29 '18 at 13:45
  • @junketsu End goal is to write reports from queries. No MDX. – banskt Nov 29 '18 at 13:52
  • @Larnu I am using MySQL. Sorry for the confusion, and thanks for editing the tags. – banskt Nov 29 '18 at 13:53
  • you can still dimensionalize these tables into Fact/Dims for lot faster querying. Minimal FK enforcement. And tables will be WIDE (esp fact tables). – junketsu Nov 29 '18 at 13:59
  • @junketsu Thanks, so that will be another approach. Do you mean I should store the accounts of all users in one dimension table? (similarly, categories of all users in one dimension table) But how could I break down the "SharedExpense" tables into facts and dimensions? I am expecting 1000s or more shared-expense-groups. – banskt Nov 29 '18 at 14:14
  • Yes all accounts for all users go as Dim_Accounts (example table name) .But you will have a [Account_id] column which you will anchor as FK into FACT table. For your problem you can also look at DATA VAULT approach (in which you can keep a table that will hold this relationship between user(s) and shared groups and assign a unique ID to each relationship) And you build your FACT table to use this unique_ID. – junketsu Nov 29 '18 at 14:22
  • Is this a reporting database only, or will this handle the daily transaction activity? (Insert, Update, Delete, etc) I don't think a dimension model is appropriate at this point, especially if you don't want to use MDX. – Wes H Nov 29 '18 at 14:27
  • @WesH this will handle daily transaction activity (with *very* frequent Insert, Update, Delete, etc.) – banskt Nov 29 '18 at 14:29
  • 2
    Do not use a Dim/Fact model for OLTP systems. Dimensional Models are excellent for reporting, but are terrible for high volume IUD activity. There are a number of other issues with your current approach. I'll try to address some of them in an answer. – Wes H Nov 29 '18 at 14:31
  • Sorry, I don't have time to write up an answer. You need to learn more about database design fundamentals. If this is a very high volume database or it will grow quickly, you may need to consider a consultant to review your design before it is built. – Wes H Nov 29 '18 at 14:57
  • @WesH I understand the time constraint, but would you have time to write down the key problems -- you don't have to explain them in details, I will find out as I learn more but some directions will be helpful. – banskt Nov 29 '18 at 17:28

1 Answers1

5

There is to much to describe all the challenges in a summary, but I'll pick out a few.

  1. Fundamental design violations: such as a table/database for each user
  2. entity design, 3NF: such as category.budget and ledger.transaction_type
  3. referential integrity/relationship design:
    • account is for one user, but account table does not contain the user id;
    • usershare is a subset of ledger, but they both point to a user;
  4. object naming concerns:
    • clear and consistent naming entities, based on real usage. Is a member a user or a user a member? If they are the same, choose one name. If they are not the same, the design is different. Do staff use client or customer rather than member?
    • consistency in your key naming. The key name should directly tie it to the source entity. Members.ID should be referenced as members_id, rather than user_id. However, see the next entry before correcting this.
    • be consistent in your entity plurality. The general consensus is that the name should describe a single record (User) rather than all the records (Users).
    • ledger.spent_on - that name is not obviously a date. It could be pointing to a user or category as well. An attribute name should describe the attribute without needing additional explanation. For example, ledger.Purchase_Date is self explanatory. It should also be clear how it relates to the entity. UserShare.Share doesn't really tell me what it contains.

Sorry to be blunt, but I would start over. Consider what you have as a good trial run and start again using the additional information you have.

  • Ask questions of your designs (Are all users members? Are all members users?). If the answer is anything other than Yes or No, break it down further.
  • Try what-if scenarios (What if a shared ledger exceeds the category budget? How will previous spending be perceived if the category budget changes?)
  • Consider what reporting questions may be asked (Who went over budget? How much are we spending on this category?) and then consider the query to answer the question.

Read up on 3NF and maybe some of the higher normalization levels as well. Whereas 3NF is pretty nearly the minimum normalization, the higher levels become increasingly specialized and may or may not be appropriate for you design.

The better you understand your data AND business, the better your design will be, and the better your end product will turn out.

Wes H
  • 4,186
  • 2
  • 13
  • 24