Good question. Your observation about storing a large number of similar values is valid.
A solution to optimise your design is to use relational table schema. With relational tables, you take common values in fields and move them to another table, then create a link to the table.
As an (unrelated) example, instead of storing the country name in a field a million times, create a field that points to a list of countries. The benefits will be that you will store (a million times) a few bytes for a number compared to (a million times) a few hundred bytes for the name.
The benefits are greater than storage, as it is more efficient to compare numbers than strings. For example, it takes a computer ONE comparison to check if ($val == 10), compared to one check for every letter in a string (imagine running check if ($val = 'a very long string') a million times.
http://en.wikipedia.org/wiki/Relational_database
As an example. assume you have a limited number of keywords.
CREATE TABLE video
(
video_id int(11) NOT NULL auto_increment,
keyword_id int(11),
website varchar(255),
url string,
PRIMARY KEY (video_id)
);
CREATE TABLE keywords
(
keyword_id int(11) NOT NULL auto_increment,
keyword_name varchar(255),
PRIMARY KEY (keyword_id)
);
Notice that the video table has a field keyword_id, instead of keyword_name, so this will store a number, instead of a string. Examples of your data will be
- video
id keyword_id url
1 1 http://domain1/path1/
2 2 http://domain2/path1/
3 2 http://domain3/path4/
:
- keywords
keyword_id keyword_name
1 short keyword
2 a long key that has many, many, many characters
If you know the keyword id, then it is easy to search for the videos.
SELECT keyword FROM keywords WHERE keyword_name = 'car';
:
SELECT * FROM video where keyword_id = ':keyword_id';
or
SELECT keyword_name, website, url
FROM video
JOIN keyword ON keyword_id
WHERE keyword_name = 'car'
I also came across a few posts on SO.
Relational Database Design Patterns?
Relational Database and Normalization for Relational Tables
Best way with relation tables