0

If I have a table Students that stores the first and last name of a student and I also want to store their hobbies, why is it bad to have a column that stores multiple hobbies separated by a comma or something similar, then retrieving and displaying them with some back-end language?

Why is it better to make another table hobbies and then have a separate row for every Hobby, If a student has up to 10, then that's another 10 rows, how come this is better than just having everything stored in the students table, with one row per student ? Are there any exceptions for this rule ?

Uiot
  • 83
  • 8
  • It basically depends on the use you'll give to the data. Normalization avoid duplicated data while keeping consistency and integrity, but the tradeoff is complexity on queries and space used on storage, among other things. – Alfabravo Sep 23 '19 at 17:00
  • *"Normalization avoid duplicated data while keeping consistency and integrity"* talking about data duplication @Alfabravo would you also normalize first and last student names, to save disk space (don't answer this comment was meant to be rhetorical question) ... – Raymond Nijland Sep 23 '19 at 17:10
  • @RaymondNijland Do you use rhetorical questions at work to solve actual things? (getting rhetorical in the comments, too) – Alfabravo Sep 23 '19 at 17:12
  • 2
    This has been asked and answered here: [Is storing a delimited list in a database column really that bad?](https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad) – Paul Spiegel Sep 23 '19 at 17:15
  • @Alfabravo What i meant with mine comment... Normalization is more complex then simply avoiding duplicated data mine example about normalizating first and last student names would be a over the top example where you could "overnormalize" but you already said it yourself with complexity on queries .. – Raymond Nijland Sep 23 '19 at 17:21
  • 1
    @RaymondNijland and the question itself is too broad to actually answer. That's why it is a comment instead of an answer. `among other things` tries to address that but, if you want to go all the way and explain Normalization to OP, be my guest. – Alfabravo Sep 23 '19 at 17:21
  • 1
    yes question is way to broad if not duplicated.. i think i will pass explaining normalization @Alfabravo , topicstarter can better read [Database Normalization (Explained in Simple English)](https://www.essentialsql.com/get-ready-to-learn-sql-database-normalization-explained-in-simple-english/) – Raymond Nijland Sep 23 '19 at 17:29

2 Answers2

1

Here are some reasons:

  • SQL has poor string processing capabilities, so it is better not to rely on them.
  • SQL has a great data structure for storing lists. It is called a table.
  • Hobbies sounds like an entity. It should have its own table.
  • Hobbies referred to in other tables should have proper foreign key relationships declared. This is not possible with a comma-delimited string.
  • You want the hobbies in the string to be free of duplicates. That is hard with string updates.
  • You want the hobbies in the string to all have common spellings. That is hard with string updates.
  • You may want to keep track of information such as when a hobby was added for the student. You cannot do that with a string.
  • You may want to get all students with one particular hobby. With a separate table, you can define an indexing structure for this type of query. That is harder with a single string.

Are those enough reasons?

When would you do this? You would do this when you have a string that you have no intention of processing inside the database -- it is essentially a black box. Then you do not care if it has commas or other characters in it.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

The only exception to this rule is if you are willing to the pay the price if you will ever need to use the hobbies in any kind of filter or join.

If you will ONLY EVER need to simply return the list of hobbies per student, then it is fine to store them as a comma-separated list.

If you might one day want to list all the students who have a certain hobby, or a certain number of hobbies, or join to student organizations based on hobbies, or any number of other scenarios that neither you nor I can foresee... in any of those cases, it is better to normalize.

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52