When should I mark a column as the primary key?

Question

I have a table like this:

// cookies
+----+---------+-------------------+------------+
| id | user_id |       token       |   expire   |
+----+---------+-------------------+------------+
| 1  | 32423   | dki3j4rf9u3e40... | 1467586386 |
| 2  | 65734   | erhj5473fv34gv... | 1467586521 |
| 3  | 21432   | 8u34ijf34t43gf... | 1467586640 |
+----+---------+-------------------+------------+

I've been thinking about this for a few days, and I guess I don't need the id column. Currently the id column is the PK, and I also have a unique index on the token column to make it both unique and fast to search.

Now I want to know: can I remove the id column from the table and make the token column the PK? Is that normal?

To be honest, I've never created a table without an id column (it has always been the PK), so it feels weird to choose the token column as the PK.

@GordonLinoff In my current structure `id` is the PK. Is that fine? — stack, Jul 05 '16 at 14:26
Google "Natural Key vs Surrogate Key". There are advantages/disadvantages to both and which is better for your system will depend on your specific requirements. — Tom H, Jul 05 '16 at 14:26
If neither the surrogate key (id) nor the natural key (token) is used as a foreign key, then using the natural key as the PK is a valid option. It won't make much difference performance-wise. Actually, a `select *` could be slightly faster with only the natural key as PK (less data to select). The benefit of the surrogate key is also visual, since it's easier to see in which order the records were inserted (if there's no creation timestamp). — LukStorms, Jul 05 '16 at 14:39
Possible duplicate of [Surrogate vs. natural/business keys](http://stackoverflow.com/questions/63090/surrogate-vs-natural-business-keys) — Jocelyn, Jul 05 '16 at 14:42

score 2 · Accepted Answer · answered Jul 05 '16 at 16:45

To the extent that token is a wide varchar, I would stick with the AUTO_INCREMENT (AI) int PK that you already have. Joins will be faster, and so will inserts. Updates would likely be about the same speed, since there is little reason to update that column and thereby force index tree changes. Inserts into child tables are also faster, because the wide varchar is not dragged into their index trees.
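The layout recommended here can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module instead of MySQL so that it runs anywhere, which means the DDL differs a little from MySQL's (there the key column would be declared INT AUTO_INCREMENT). The `cookie_events` child table and the token values are hypothetical, since the question shows no child tables and its tokens are truncated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE cookies (
    id      INTEGER PRIMARY KEY,       -- narrow surrogate key, auto-assigned
    user_id INTEGER NOT NULL,
    token   TEXT    NOT NULL UNIQUE,   -- unique index: enforces uniqueness, speeds lookups
    expire  INTEGER NOT NULL
);
-- Hypothetical child table: it references the int id, not the wide token.
CREATE TABLE cookie_events (
    id        INTEGER PRIMARY KEY,
    cookie_id INTEGER NOT NULL REFERENCES cookies(id),
    event     TEXT    NOT NULL
);
""")

# Placeholder token values, not the (truncated) ones from the question.
conn.executemany(
    "INSERT INTO cookies (user_id, token, expire) VALUES (?, ?, ?)",
    [(32423, "example-token-a", 1467586386),
     (65734, "example-token-b", 1467586521)],
)

# Searching by token still uses the unique index.
by_token = conn.execute(
    "SELECT id, user_id FROM cookies WHERE token = ?", ("example-token-b",)
).fetchone()

# Child rows carry only the small integer key.
conn.execute(
    "INSERT INTO cookie_events (cookie_id, event) VALUES (?, ?)",
    (by_token[0], "login"),
)

# The unique index rejects a second row with the same token.
try:
    conn.execute(
        "INSERT INTO cookies (user_id, token, expire) VALUES (?, ?, ?)",
        (21432, "example-token-a", 1467586640),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(by_token, duplicate_rejected)
```

Keeping both the int PK and the unique index costs one extra index on the table, but token lookups stay fast and every referencing table stores only the integer.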

It comes down to preference and readability too. As for readability, there is little of that with such a varchar. It is not as if it is a category like 'shoes'. It is a miserable unreadable non-human form. So as for readability, there is little argument for having the token as PK. Granted though, at times, it may be slightly useful.

Additional composites (multi-column indexes)

When you start combining the PK of choice with other columns in composites (additional indexes you may choose to have), it becomes very apparent that the thin int is the better choice, even with moderately large datasets.
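A rough back-of-the-envelope calculation shows why. It assumes a storage engine such as MySQL's InnoDB, where every secondary index entry also stores the primary key columns. The figures are assumptions for illustration only: 4-byte ints, a 64-byte token (the real token length is not given in the question), 10 million rows, and a hypothetical composite index on (user_id, expire). Per-record overhead and page fill are ignored.

```python
INT_BYTES = 4        # assumed size of an INT column
TOKEN_BYTES = 64     # assumed token length; the question's tokens are truncated
ROWS = 10_000_000    # assumed table size


def index_payload_bytes(indexed_bytes: int, pk_bytes: int, rows: int) -> int:
    """Approximate key payload of a secondary index that also stores the PK."""
    return (indexed_bytes + pk_bytes) * rows


# Hypothetical composite index on (user_id, expire): two ints per entry.
composite_bytes = INT_BYTES + INT_BYTES

with_int_pk = index_payload_bytes(composite_bytes, INT_BYTES, ROWS)
with_token_pk = index_payload_bytes(composite_bytes, TOKEN_BYTES, ROWS)

print(f"int PK:   {with_int_pk / 1e6:.0f} MB")
print(f"token PK: {with_token_pk / 1e6:.0f} MB")
print(f"ratio:    {with_token_pk / with_int_pk:.1f}x")
```

Under these assumptions the same composite index carries about six times as much key data when the token is the PK, and that cost repeats for every additional secondary index.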

Answer 2 · Mosia Thabo · answered Nov 22 '19 at 00:48

Three years after you posted this question, I felt I had to say something, since it still comes up when other developers with the same problem struggle to decide how to structure such a table.

Without saying much, I want to give you a scenario. Imagine you had to check manually whether these two values are the same: value1 = 1223611547921cvdfr and value2 = 1223611547921cvdfr... What would you do? The right approach would be to compare the two values character by character, from start to finish. Seconds later, when you're done, you'd say, "Well, these values are the same."

But what if you were to compare value1 = 2 and value2 = 2? Within a split second you'd say, "They're the same."

The same thing happens with computers: more complicated values lead to longer comparison or load times. Often the difference is too small to notice, but what happens if you run a site like Facebook, where billions of users log in all the time?

So it's quicker to run SELECT user_id FROM cookies WHERE id = 1 than SELECT user_id FROM cookies WHERE token = 'dki3j4rf9u3e40...'.

Normally, though, what I'd do is add a security key, such as a GUID or token, that must be passed together with the id on the request before user information is returned from the server. For example, you get the user by id and then confirm that the stored security key matches the one provided. Otherwise, return not-found.

Sometimes it's about simplicity, and simplicity is the preferred way in modern programming. I have seen developers use the wrong methods while insisting "it's my preference". Often what you prefer leads to bad code, so it's important to remain vigilant and keep up with modern design patterns as technologies evolve.
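The "look up by id, then confirm the key" flow might look like the sketch below. It uses Python's sqlite3 module with a throwaway in-memory table. The function name `find_user_id`, the placeholder token, and the choice of `hmac.compare_digest` for the comparison are my assumptions, not part of the original setup.

```python
import hmac
import sqlite3
from typing import Optional


def find_user_id(conn: sqlite3.Connection, cookie_id: int,
                 presented_token: str) -> Optional[int]:
    """Fetch the row by its int PK, then confirm the token matches.

    Returns None (treated as "not found") for a missing id or a wrong
    token, so the caller cannot tell the two cases apart.
    """
    row = conn.execute(
        "SELECT user_id, token FROM cookies WHERE id = ?", (cookie_id,)
    ).fetchone()
    if row is None:
        return None
    user_id, stored_token = row
    # compare_digest is intended to reduce timing differences between
    # matching and non-matching inputs.
    if not hmac.compare_digest(stored_token.encode(), presented_token.encode()):
        return None
    return user_id


# Throwaway in-memory table with a placeholder token for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cookies ("
    "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, "
    "token TEXT NOT NULL UNIQUE, expire INTEGER NOT NULL)"
)
conn.execute(
    "INSERT INTO cookies (user_id, token, expire) VALUES (?, ?, ?)",
    (32423, "example-token-a", 1467586386),
)

print(find_user_id(conn, 1, "example-token-a"))  # matching id and token
print(find_user_id(conn, 1, "wrong-token"))      # right id, wrong token
print(find_user_id(conn, 99, "example-token-a")) # unknown id
```

A real handler would probably also check `expire` before trusting the row. That check is left out here to keep the sketch focused on the id-plus-token step.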

score 1 · Answer 3 · answered Jul 05 '16 at 14:46

In general we prefer the table's id to be the primary key, but all that "primary" requires is that the column is not null and uniquely identifies each record of the table. So if you want to make token the primary key, you can easily do so, but make sure that no other table depends on id. Then whenever you need to fetch a record, you can fetch it using the token.
