1

I am working on a project, that has multiple processes and each process has different data items to process. Data items (For different processes) have different columns (but always the same columns for the same process).

At first, I assumed that it would be fine to have a table for all of the processes and then, whenever a new process is created, another table with the item data could be created as well, but it turns out, that there would be a new process way to often to create new tables all the time. Then I was looking into nested tables but found out that there is no concept of the nested tables in MySQL. (I've heard that this could be done with MariaDB. Has anyone worked with it?)

To make it a bit more clear here is the current concept (columns and values here are only approximate to make the concept more clear):

process_table:

ID | process_name | item_id | ...
---------------------------------
1  | some_process | 111     | ...    
2  | other_process| 222     | ...    
3  | third_process| 333     | ...    
4  | third_process| 444     | ...
...

item_tables:

item_table_1:
ID | Column1 | Column2 | process_name | ...
--------------------------------------
111| val1    | val2    | some_process | ...
...

item_table_2:
ID | Column4 | Column5 | process_name | ...
--------------------------------------
333| val1    | val2    | third_process| ...
444| val3    | val4    | third_process| ...
...

So then for each new process, there would be new item_table and for each process, it needs to have different column names, and in item table, the specific item would be linked to 'item_id' column in the process table.

I think that the easiest solution (when creating new tables all the time is not an option) for this would be nested tables, where, in the process table, there could be another column, that would hold the item_table values and then those could have different columns based on the process itself.

So the big question is: Is there at least anything similar to nested tables or anything else in MySQL that would help me implement structure like this without creating new tables all the time, and if not, then maybe there are some tips or reviews about MariaDB? Maybe someone has already implemented nested tables with it (If that is possible at all)

One of the solutions would be to have one table for the 'item_table' and then have one column for all the different values for processes, that would be stored in JSON format for example, but this would make it a lot harder to read the table.

For example:

item_table:
ID | process_name | data
--------------------------------------
111| some_process | {values: {column1:val1,column2:val2,...}}
Oskars
  • 407
  • 4
  • 24
  • Have you considered either normalising the data, or having one column in the items table for each possible column? I.e. just put in Col1|col2|col3|col4, and only populate the ones you need? – JeffUK Jan 21 '20 at 09:12
  • This is a wrong data normalization and will bring you a lot of issues in future. If you need different column names, then better write a JSON object for each ID as in your example. MySQL has native JSON support now, so the reading would be quite easy. – mitkosoft Jan 21 '20 at 09:16
  • @mitkosoft great, thanks, did not realize MySQL has JSON support now. I will check that out. – Oskars Jan 21 '20 at 09:21

1 Answers1

1

Do you use the values from the items-table for processing or something like that (do you run queries against them)?

This table/database structure looks.. ineffecient and unmaintainable imo.

This should all be done with just two tables. The processes table and the items table that contains the process_id (not the name) from the processes table.

If the column count for the items is always the same, just use "generic" names for the values like value_1, value_2 (or whatever suits best for the process) or a json/blob/varchar field with a JSON string for example. (depends if you need to run queries against this data)

id | process_id | data

EDIT: Your edit and second solution should be the way to go. "easy readability" has no priority above functionality and performance.

Bert Maurau
  • 979
  • 5
  • 21
  • A json/blob/varchar field is just as inefficient and unmaintanable; it would be better to store any columns as columns. – JeffUK Jan 21 '20 at 09:21
  • Thanks, I will check out the MySQL support for JSON. The whole problem raises from the fact, that it is not possible to normalise the column names (Could normalise the amount of columns, but only if it is fine for some processes to use only few of the columns and leave the rest empty) – Oskars Jan 21 '20 at 09:24
  • @JeffUK True, but depends if you're using the values for any queries or database operations, or to just solely "store" the process info, then this would offer more flexibility in case the process column amount changes in the future, or you want to store extra info down the road I think? – Bert Maurau Jan 21 '20 at 10:21
  • I've noticed the added 'already answered question' references, and this would indeed be the more efficient way to go. – Bert Maurau Jan 21 '20 at 10:22