8
  • I understand the value of primary keys.
  • I understand the value of indexes.

Should every MySQL table have an auto-incremented primary key (ideally with INT field type)?

Update

@Raj More's answer seems most efficient. The issue when I think about it, however, is how this auto-incremented primary key ID will relate to other tables. For example:

table 1

ID  |   firstname  |   lastname | email
----------------------------------------
1   |    john      |   doe      | 1@email.com
2   |    sarah     |   stow     | 2@email.com
3   |    mike      |   bro      | 3@email.com

table 2

ID  | memberid |    display      |    address
--------------------------------------------
1   |    1     | funtime zone    |   123 street
2   |    3     | silly place llc |  944 villa dr 

In the example above, a consumer may come to the site and choose to register for a free product/service. If the consumer chooses, they are able to give additional information (stored in table 2) for additional mailing, etc. The problem I see is in how these tables relate to the 'primary key auto-incremented field'. In table 2, the 'memberid' relates to table 1's ID but this is not 'extremely' clear. Any new information placed into table 2 will increment by 1 whereas not all consumers will choose to participate in the table 2 required data.

JM4
  • 6,740
  • 18
  • 77
  • 125
  • possible duplicate of [Surrogate vs. natural/business keys](http://stackoverflow.com/questions/63090/surrogate-vs-natural-business-keys) – ypercubeᵀᴹ Sep 07 '11 at 22:21
  • @ypercube - wow really? You list as a possible duplicate of an example given from another answer? The first commenter made a similar assessment, voted for close, then removed his own comment. – JM4 Sep 07 '11 at 22:24
  • Why does that sounds strange? ("*Exact duplicate: This question covers exactly the same ground as earlier questions on this topic; its answers may be merged with another identical question.*") – ypercubeᵀᴹ Sep 07 '11 at 22:35
  • 1
    My answer will be deleted shortly. My point is that there are many pros and cons for either options (which were discussed in other similar questions). Saying "I like this, I choose Yes, Read this" is not an answer really. – ypercubeᵀᴹ Sep 07 '11 at 22:51
  • Your update is a great example of a table(2) (which has a 1:1 relation to another one) where an auto-incremented Primary Key is useless. The `memberid` can be the PK (not auto incremented as it also an FK to table1). – ypercubeᵀᴹ Sep 07 '11 at 22:56
  • @ypercube - thanks for the help. In the example of table(2) above, do you see value in creating an auto-incremented value (or the ID field in general) at all? – JM4 Sep 07 '11 at 23:02
  • No, not at all. The `memberid` can be both the `PRIMARY KEY` and `FOREIGN KEY` that `REFERENCES table1(ID)`. No need to add another unique constraint and index. – ypercubeᵀᴹ Sep 07 '11 at 23:08
  • @ypercube - thanks. All of my tables currently with MyISAM engine; I'll have to look into switching to InnoDB for Foreign Key usage. – JM4 Sep 07 '11 at 23:32
  • What you're showing is a 1:1 relationship - I don't think I can recall an instance of ever having built one. If you change the same thing to a 1:n relationship, you may see a larger point in having surrogate keys. – Raj More Sep 08 '11 at 12:40
  • @Raj: So, what would you choose for a PK in a table with a 1:1 relationship? – ypercubeᵀᴹ Sep 08 '11 at 15:19
  • I don't use 1:1 relationship tables (don't remember the last time I needed one). If I did use one, however, I would reuse the PK from the parent. – Raj More Sep 08 '11 at 16:55
  • @Raj More - You response indicates you have tables with perfectly matching data. My example shows a scenario you wouldnt. You imply you build tables & add new fields if needed rather than create a new relational table? E.g. you build an enrollment form to capture information required at one point. Later, you decide you want to capture additional information such as 'gender' and a few other things. Instead of having hundreds/thousands of null rows, why NOT build a new table with the new fields and populate based on the presence of, then run a JOIN (left or inner) to retrieve full records? – JM4 Sep 08 '11 at 17:56
  • Why is it an issue to have hundreds or thousands of rows that a NULL value in a column? – Raj More Sep 08 '11 at 18:02

3 Answers3

8

I am not a huge fan of surrogate keys. I have yet to see a scenario where I would prefer to use one for every table of a database.

I would say No.

Read up on this answer: surrogate-vs-natural-business-keys


The above may be seen as sarcastic or flaming (despite the surprisingly many upvotes) so it's deleted.

In the general case, there have been many questions and answers on surrogate and natural keys so I felt this question is more like a duplicate. My view is that surrogate keys are fine and very useful, mainly because natural keys can lead to very big primary keys in the low end of a chain of connected tables - and this is not handled well by many RDBMS, clustered indexes get big, etc. But saying that "every MySQL table should have an auto-incremented primary key" is a very absolute statement and I think there are cases when they really offer little or nothing.

Since the OP updated the question, I'll try to comment on that specific topic.

I think this is exactly a case where an autoincrementing primary key is not only useless but adds negative value. Supposing that table1 and table2 are in 1:1 relationship, the memberid can be both the Primary Key and a Foreign Key to table1.

Adding an autoincrementing id column adds one index and if it's a clustered one (like InnoDB PK indexes) increases the size of the memberid index. Even more, if you have such an auto-incrementing id, some JOIN of table2 to other tables will have to be done using this id (the JOINs to tables in 1:n relation to table2) and some using memberid (the JOINs to tables in 1:n relation to table1). If you only have memberid both these types of JOINs can be done using memberid.

Community
  • 1
  • 1
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
  • I anticipate this may get downvoted based on sarcasm but note I did not. I have read through the provided link (given by Raj More above) a couple times and think it does hold some good intrinsic value (some concepts over my head as I am new to databases development). I have added an update which you may be able to provide commentary on. – JM4 Sep 07 '11 at 22:49
  • you (or maybe I myself realized, ha) make a good point actually when thinking about useless data. The only **benefit** of something like an ID in table 2 above I can see is simply to see when something was inserted and in what order (though a date can accomplish this as well). – JM4 Sep 07 '11 at 23:05
  • Well, then it's not an `ID(dentifier)` but an ordering. And there are many cases where autogenerated IDs do not have the same order as the order of insertion (UUIDs, concurrent threads, master-slave dbs and other scenarios). – ypercubeᵀᴹ Sep 07 '11 at 23:19
3

I am a huge fan of surrogate keys. I have yet to see a scenario where I would prefer not use one.

I would say Yes.

Read up on this answer Surrogate vs. natural/business keys

Edit

I will change my answer to include the following:

There are certain scenarios that I now use the actual value as a surrogate key:

DimDate (20151031, 20151101, 20151102....) DimZipCode (10001, 10002, 10003...)

Everything else gets Surrogate Keys.

Community
  • 1
  • 1
Raj More
  • 47,048
  • 33
  • 131
  • 198
  • For every table? A table `Country` with `Code, Name` as fields for example would be better defined with `Code CHAR(2) Primary Key`. – ypercubeᵀᴹ Sep 07 '11 at 22:08
  • 1
    If I had bad data for a country code and it needed to be changed in the dimension, then I would have to update my entire Fact table, and my indexes on the fact would possibly be out of sync. Plus I'd rather not cascade updates – Raj More Sep 07 '11 at 22:11
  • @Raj More - added supplemental update which I think ultimately highlights my question. In the example, the 'auto-incremented' ID of each table will differ even though the actual information is relational. As more and more tables are added, it seems the 'auto-incremented' value becomes futile and simply a row count. – JM4 Sep 07 '11 at 22:46
  • 2
    If you have bad data then that's the best reason you can think of to simply not admit them to enter your database. If someone tells you he lives in the country with ISO code XQ, then do you conclude that your list of ISO codes is out of date or do you conclude that that person is a liar ? – Erwin Smout Sep 07 '11 at 23:03
1

Yes, with one exception:

A table which implements a n:m relationship between two other tables, a pure link table, does not need such a field if it has two fields only, referencing the two primary keys of the linked tables. The primary key of the link table then consists of the two fields.

As soon as the link table has extra information, it needs a simple single-field primary key.

Having said that, there may be more exceptions; database design is a very broad field...

EDIT: Added more in the previous sentence.

Moritz Both
  • 1,598
  • 1
  • 13
  • 21
  • If the link table is going to be used in any further relationships downstream, now you have a bigger FK (and bigger indexes). Consider going down 3 relationships of such linked tables - a FK set of 8 columns instead of 3. Personally, I'd rather have the surrogate key to begin with. – Raj More Sep 07 '11 at 22:15
  • @Raj, you are right; if the link table is going to be used in any further relationships downstream it is no longer a simple link table, it has extra information, too. I agree to add a surrogate key in this case. – Moritz Both Sep 07 '11 at 22:22
  • 1
    So there is one exception (which itself has an exception) and there may be other exceptions?! – onedaywhen Sep 08 '11 at 08:55
  • 1
    @onedaywhen: Yes, something like that. I was trying to contribute our rule of thumb and I will insert the missing word "more". This subjects seems to set free a lot of feelings, by the way :) – Moritz Both Sep 08 '11 at 09:44
  • @onedaywhen - you have voted to close a question because the answers can have exception? That makes a lot of sense.... – JM4 Sep 08 '11 at 14:36
  • @JM4: No, I guess because *"the subject seems to set free alot of feelings"*. – ypercubeᵀᴹ Sep 08 '11 at 14:51