1

The general form of a many-to-many relationship is to create three two-column tables (e.g. to relate favorite movies of members). Two tables are list of movies and members. In these two tables, we assign an auto_increment ID (which is primary key).Then storing relationship of these two IDs in the third table.

I wonder why we use ID for these two columns? We can have two one-column tables as lists of movies and persons; and then create the relationship table relating movies to persons as:

Fred   Gladiator
Brian  Godfather
Fred   Godfather

The only disadvantage of this method is that searching integer-only table is easier to find relationships. Instead we have reduced two indexed columns.

Any idea?

Googlebot
  • 15,159
  • 44
  • 133
  • 229

3 Answers3

1

In any real world scenario the two tables (movies and persons) will have a lot more columns (like movies.year, persons.lastname, persons.firstname etc). In this case it makes sense to have a single primary key column in each of the tables (which could as well be a non auto-increment column e.g. movies.name) for the many to many relation.

But if the situation is as trivial as you suggest then your solution should be good enough. I don't think the disadvantage you mention will really be a concern.

Aditya

Aditya Naidu
  • 1,362
  • 9
  • 12
1

I see a number of potential issues:

  • How do you distinguish between multiple Freds?
  • What happens if a movie or person changes name (pretty common when a movie is in development)? Now you have to go and make sure every table that references it is updated, whereas an ID would remain constant.
ceejayoz
  • 176,543
  • 40
  • 303
  • 368
  • Really nice points. Most of times we deal with unique values (e.g. username instead of name); but the second issue is absolutely correct. – Googlebot Oct 25 '11 at 02:08
1

I wonder why we use ID for these two columns?

I think it's usually cargo cult programming.

  • If you can't distinguish one Fred from another without an arbitrary, meaningless id number, then you can't distinguish one Fred from another with an arbitrary, meaningless id number.
  • If you anticipate changes to either of the names, then you should cascade updates. (Many Oracle developers insist that keys must be immutable. That's not because mutable, natural keys are "bad"; it's because Oracle doesn't support ON UPDATE CASCADE.)
  • There's a widespread--but incorrect--belief that id numbers are simply "faster", possibly because dbms engines are "optimized for joins on id numbers". Whether id numbers are faster in joins depends on the width of your tables, the content of the natural key, the number of rows, the number of joins required, page size, and the nature of your queries. What's often overlooked is that using natural keys usually reduces the number of joins. Sometimes, natural keys eliminate all the joins, even in tables that are in 5NF.

See this recent SO answer for some measurements.

Don't guess. Measure.

Community
  • 1
  • 1
Mike Sherrill 'Cat Recall'
  • 91,602
  • 17
  • 122
  • 185