1

Edit 1:

Because few good ppl have pointed out that my question isnt very clear, I thought I will rewrite it and make it more clear now.

So basically, I am making an app, which allows users to create his own form with his own set of input fields, with data like name, type etc. After creating his form and he publishes the form, whenever there is an entry in the form, the data gets saved into the db ofcourse. Because the form itself is dynamic. I need a way to save this data.

My first choice was JSONizing it and saving. But because I cannot do any SQL queries on them, if I save in JSON format, i am eliminating this option.

Then the simple method is storing in a table like (id, rowid, columnname, value) and i keep the rowid same for all row data. But in this way, if a form contains 30 fields, after 100 entries my db would have 3000 rows. so in the long run, it would go huge and I think queries will get slow when there are millions of rows in the table.

Then I got this idea of a table like (id, rowid, column1, column2...column100). And i will save all the inputs in the form into single row. In this way it does add only 1 row per submit and its easier to query too. I will store the actual column names and map them to the right column(number) from there. This is my idea. column100 because 100 is the maximum inputs the user can add in his form.

So my question is, whether my idea is good, or should I stick to the classic table.

  • Your question is a bit vague, even warm and fuzzy, but arbitrarily adding columns in a pivot fashion won't scale, as you have already realized. In addition, adding 10 times as many rows does not necessarily mean ten times worse performance, especially if you have setup indices properly. – Tim Biegeleisen Aug 15 '16 at 10:13
  • 1
    In a nutshell, SQL loves rows and hates columns. Of course, in reality, it's a lot more complicated than that. Normalisation is key. – Strawberry Aug 15 '16 at 10:17
  • Generally speaking, a dynamic schema is a *bad idea*. Perhaps the [Entity–attribute–value model](https://en.wikipedia.org/wiki/Entity-attribute-value_model) will suffice? Hard to say without understanding your actual use case. – eggyal Aug 15 '16 at 10:17
  • I edited my question completely, and gave an explaination more clearly this time. hopefully. – kapilsriram Aug 15 '16 at 10:30
  • The "simple method" is basically the EAV approach I recommended above; with suitable indexing, it should be fine. – eggyal Aug 15 '16 at 12:22

2 Answers2

2

If I've understood your question, you need to have to design a database structure to store data whose schema you don't know in advance.

This is hard - there's no "efficient" solution in relational databases that I'm aware of.

Option 1 would be to look at a non-relational (NoSQL) solution instead.

I won't elaborate the benefits and drawbacks, as they are highly dependent on which NoSQL option you choose. It's worth noting that many relational engines (including MySQL) allow you to store and query structured data formats like JSON. I've not used this feature in MySQL myself, but similar functionality in SQL Server performs very well.

Within relational databases, the common solution is an "Entity/Attribute/Value" (EAV)schema. This is sorta like your option 2.

EAV designs can theoretically store an unlimited number of columns, and an unlimited number of rows - but common queries quickly become impossible. In your sample data, finding all records where the name begins with a K and the power is at least 22 turns into a very complex SQL query. It also means the application needs to enforce rules of uniqueness, mandatory/optional data attributes, and data transformation from one format to another.

From a performance point of view, this doesn't really scale to complex queries. This is because every clause in your "where" needs a self join, and indexes won't have a big impact on searches for non-text strings (searching for numerical "greater than 20" is not the same as searching for a text "greater than 20".).

Option 3 is, indeed, to make the schema logic fit into a limited number of columns (your option 1).

It means you have a limitation on the number of columns, and you still have to manage mandatory/optional, uniqueness etc. in the application. However, querying the data should be easier - finding accounts where the name starts with K and the power is at least 22 is a fairly straightforward exercise.

You do have a lot of unused columns, but that doesn't really impact performance much - disk space is so cheap that all the wasted space is probably less space than you carry around in your smart phone.

Community
  • 1
  • 1
Neville Kuyt
  • 29,247
  • 1
  • 37
  • 52
  • awesome, this is exactly my problem. because i have already into MySql I would eliminate your option 1. Within Option 2 and Option 3, what would be best. From what I see, if I only use them to fetch data from ids instead of complex quering, option 3 seems right. But If I just want to sort them by some column, and search within, what would be better is it Option2 or 3? – kapilsriram Aug 15 '16 at 12:19
  • There is no universal "best" - it all depends on your constraints. If you are confident you will never need more than 100 columns, I'd go for a 100 columns table. It makes querying much easier. – Neville Kuyt Aug 15 '16 at 12:22
  • I also could figure it out little like this, but my real doubt was, because i am going to create 100 columns table, and i would hardly use all of the 100 columns every time, in an average only 20 maximum, 100 is just a extreem case. So I felt, maybe I am waste lots of space or performance. What do u think. – kapilsriram Aug 15 '16 at 12:25
0

If I understand your requirement, what I will do with your requirement is to create a many to many relationship something like this:

(tbl1) form: 
    - id 
    - field1
    - field2

(tbl2) user_added_fields:
    - id 
    - field_name

(tbl3) form_table_user_added_fields: 
    - form_id (fk)
    - user_added_fields_id (fk)

This may not likely to solve your own requirements, but I hope this will give you a hint. Happy coding! :)

jomsk1e
  • 3,585
  • 7
  • 34
  • 59
  • Thanks mate, but I think I wasnt clear in my question, i just re-edited it completely, can you check it now if possible. And see if it gives clearer idea. – kapilsriram Aug 15 '16 at 10:32