Database modeling: best aproach for multiple categories for multiple elements

Question

Let's say I have 10 books, each book has assigned some categories (ex. :php, programming, cooking, cookies etc).

After storing this data in a DB I want to search the books that match some categories, and also output the matched categories for each pair of books.

What would be the best approach for a fast and easy to code search:

1) Make a column with all categories for each book, the book rows would be unique (categs separated by comma in each row ) -> denormalisation from 1NF

2) Make a column with only 1 category in each row and multiple rows per book

I think it is easier for other queries if I store the categories 1 by 1 (method 2), but harder for that specific type of search. Is this correct?

I am using PHP and MySQL.

PPS : I know multi relational design, I prefer not joining every time the tables. I'm using different connection for some tables but that's not the problem. I'm asking what's the best approach for a db design for this type of search: a user type cooking, cookies, potatoes and I want to output pairs of books that have 1,2 more or all matched categs. I'm looking for a fast query, or php matching technique for this thing... Tell me your pint of view. Hope I'm understood

score 2 · Answer 1 · answered Dec 30 '11 at 01:09

2

What you want to do is have one table for books, one table for categories, and one table for connecting books and categories. Something like this:

books

book_id | title | etc

categories

category_id | title | etc

book_categories

book_id | category_id

This is called a many-to-many relationship. You should probably google it to learn more.

answered Dec 30 '11 at 01:09

powerbuoy

12,460
7
48
78

if I only have a few categories, would creating a categories table be somewhat overkill? what about storing just a string? – Jonathan Lin Jan 31 '13 at 07:57
I would definitely still recommend using a separate table for categories. – powerbuoy Jan 31 '13 at 15:11

score 2 · Accepted Answer · edited May 23 '17 at 10:32

2

Use method 2 -- multiple rows per book, storing one category per row. It's the only way to make searching for a given category easy.

This design avoids repeating groups within a column, so it's good for First Normal Form.

But it's not just an academic exercise, it's a practical design that is good for all sorts of things. See my answer to Is storing a comma separated list in a database column really that bad?

edited May 23 '17 at 10:32

Community

1
1

answered Dec 30 '11 at 01:11

Bill Karwin

538,548
86
673
828

Yes 80% i agree with ur post linked... but as u said... sometimes maybe u need denormalisation... i know its better for alot more queries to stick to the normalisation rules , but that type of search will be very annoying to querie...anyway i gues it will be up to me to choose... and i think ill stick with method 2 like u all said. Ty – Tudor Dec 30 '11 at 01:29

score 2 · Answer 3 · answered Dec 30 '11 at 01:29

2

This relationship is a Many-To-Many (a book can have multiple categories and a category can be used in several books).

Then we have the following:

Diagram

Got it?

=]

answered Dec 30 '11 at 01:29

Vinicius Cainelli

847
6
11

score 0 · Answer 4 · answered Dec 30 '11 at 01:09

0

I would recommend approach number 2. This is because approach 1 requires a full text search of the category column.

You may have some success by splitting it up into two tables: One table has one line per book and a unique id (call the table books), and the other has one line per book per category and references the book id from the first table (call the table bookcategories). Then if you only need book data you use table books, where if you need categories you join both tables.

answered Dec 30 '11 at 01:09

Dan

10,531
2
36
55

like a said i use unique connection for some tables/ queries... joining tables ruins every bit of sercurity added...but yea ur right – Tudor Dec 30 '11 at 01:57
How does `join` ruin security? – Dan Dec 30 '11 at 01:59
lets say u have a different connection for 2 tables... by joining them u can use only 1 connection... and if that connection used for the joined querie is compromissed it will make vulnerable both tables. u understand now? – Tudor Dec 30 '11 at 02:11

Database modeling: best aproach for multiple categories for multiple elements

4 Answers4