0

Im in the process of setting up a database of CCG cards. Each card has keywords such as "lightning" or "water" and can have multiple keywords per card.

So the question is "Is it better to have each keyword in its own field and and save if the card has it or not OR is it better to save all the keywords in one field and parse them out or soemthing for a search function?"

The card game will potentially be adding more keywords in the future and there are multiple aspects of the card that follow the same principle as the keywords(ie having many options in one field.)

I'm somewhat worried about having a table with 50+ columns. Thanks for all help and answers in advance. If i need to clarify just let me know.

Werbel
  • 113
  • 1
  • 2
  • 15
  • 2
    First rule of database normalization: **one cell = one value** - does that answer your question? I would never ever put multiple values into a single column - that is just asking for trouble and messy code down the road - and you're not gaining anything from that approach, either.... 50 columns **IS NOTHING** for a table in any decent RDBMS .... – marc_s Aug 31 '12 at 20:37
  • Time to read a published academic textbook on information modelling, the relational model & DB design. (Manuals for languages & tools to record & use designs are not textbooks on doing information modeling & DB design.) Dozens of published academic information modeling & DB design textbooks are online free in pdf. stanford.edu has a free online course. – philipxy Jun 14 '19 at 19:22

3 Answers3

5

Neither.

You should have a table with one row per card-keyword combination.

ie

CardID   Keyword
1        lightning
1        water
2        lightning
3        mountain

If keywords are going to be repeated, you may want to add a keywords table, and have the ID in this junction table instead.

CardID   KeywordID
1        1
1        2
2        1
3        3

Keywords

KeywordID Keyword
1         Lightning
2         Water
3         Mountain

This is called "Normalisation". Opinions vary, but IMHO, in general, you should aim for "3rd Normal Form" at the very least, and preferably higher.

podiluska
  • 50,950
  • 7
  • 98
  • 104
  • Standard. This organization will allow you to do queries like "give me all cards that have Lightning and Water", which would be messy using the multi-columns or jam-all-the-keywords-into-a-string approach. If these cards have many characteristics, then you really want a Card entity table, a Keyword (or Power) entity table, and an association table that correlates Cards to Powers, as podiluksa noted. –  Aug 31 '12 at 18:24
  • So how do i get this to work if the cards are across multiple tables and have a multiple part primary key? – Werbel Aug 31 '12 at 18:35
  • @user1638987 You need to explain why those cards are across multiple tables and how they're organised before that can be explained. – podiluska Aug 31 '12 at 18:38
  • This is what I have Expansion table Name / Expansion code (PK) Cards Expansion code(PK) / Card ID(PK) / etc.. / etc.. Keywords Symbol(PK) / Card ID(PK) / Keyword I'm thinking this is what you were talking about. Assuming a card being able to have 3 or more keywords makes no difference. – Werbel Aug 31 '12 at 19:06
  • My apologies that looks like garbage when posted but I can't find how to show a table structure in a comment. Pls forgive a noob. – Werbel Aug 31 '12 at 19:11
  • 1
    @user1638987 You could always edit a question and then just refer to that edit in your comment. – Branko Dimitrijevic Aug 31 '12 at 20:46
2

If you are accessing keywords separately, yet storing them together in the same field, you are violating the principle of atomicity and therefore the 1st normal form.

This is bad! At bast, you won't be able to search your data efficiently1, and at worst you'll leave your database defenseless against invalid data2.

So definitely keep each keyword in its separate field, but that doesn't necessarily mean these fields should be in different columns. If there are different kinds of keywords that should be treated differently somehow, then having a column for each of these kinds makes sense. Also, separate columns might be justified if the keywords per card are limited to a small number.3

In your case, however, I suspect all keywords are treated equally, and you don't need any limit on the number of them, leading to a classic one-to-many relationship, as podiluska already explained.


1 Indexes don't work on "partial" fields.

2 E.g. what if character used to separate keywords within the same field is given as a part of the keyword itself, either because of the bug or even intentionally? Do you need any length restrictions, uppercase/lowercase restrictions? Etc, etc... All these things can theoretically be enforced at the application level, but there are very good reasons why enforcing them at the database level is preferable.

3 Although there are arguably better solutions for imposing such a limit.

Community
  • 1
  • 1
Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
  • As a note there are times when not-first-normal-form can give you better constraints than you could with normalization but I agree, this is an advanced topic and better left alone if one is asking the question. For example, suppose I want a constraint such that all general ledger line items add up to 0 for each transaction. I can't express that on a normalized database. I have to break first normal form to do so or write relatively complex triggers to try to make that work. – Chris Travers Sep 02 '12 at 10:32
  • @ChrisTravers I'm not sure I follow. Whet exactly would a table that violates 1NF allow you to do, that would be impossible with a table that doesn't? – Branko Dimitrijevic Sep 02 '12 at 11:20
  • I didn't say impossible, i said impossible without relatively complex triggers. What you can do is store an array of tuples in a column, define a constraint against that column, and then use a trigger to maintain a materalized view to make the data move over to a main table for read-queries. Since the data is append-only the model is relatively simple, the trigger is simple (compared with "pull every record in this transaction, group, aggregate, and compare). – Chris Travers Sep 02 '12 at 12:42
  • @ChrisTravers How would the constraint look like? – Branko Dimitrijevic Sep 02 '12 at 14:11
  • On PostgreSQL what you do is create a table method and then create a constraint on that. You could do something similar on Oracle but the syntax woudl be different. On PostgreSQL it would be like: `create function running_balance(journal_entry) returns numeric language sql immutable as $$ select sum(amount) from unnest($1.line_items); $$;` and then the constraint ends up looking like: `alter table journal_entry add constraint check (journal_entry.running_balance = 0)` – Chris Travers Sep 02 '12 at 14:21
1

I just want to add here that denormalization (including not-first-normal-form designs like many values in a field) are advanced stuff with lots of potential for problems. Sometimes they are needed but so far you haven't shown us any reason why. Also when you use something like this, you really want to make sure that you know what you are doing and can break your data out, if needed, into normalized forms.

So start with the normalized form. If you are asking here read up on normalization and learn to work with it. Then only where you need to (usually to express a data constraint) denormalize to the extent necessary. If you are asking which is better and don't have a really good reason for using many values in a field, you are better off breaking it off into another table as the other commentors have suggested.

Chris Travers
  • 25,424
  • 6
  • 65
  • 182