What is the difference between a candidate key and a primary key?

Question

Is it that a primary key is the selected candidate key chosen for a given table?

CK & PK are relational database terms & have different & differing meanings in SQL. [What to do with null values when modeling and normalizing?](https://stackoverflow.com/a/40733625/3404097) — philipxy, Mar 10 '19 at 14:38

John Woo · Accepted Answer · 2012-10-10T07:05:41.937

101

Candidate Key – A Candidate Key can be any column or a combination of columns that can qualify as unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as Primary Key.

Primary Key – A Primary Key is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key.

More on this link with example

edited Oct 10 '12 at 07:05

answered Oct 10 '12 at 06:37

John Woo

258,903
69
498
492

A CK is a set of columns that is unique *and* does not contain a smaller set of columns that is unique. An SQL PK/UNIQE declaration actually declares a superkey. – philipxy Jul 20 '17 at 19:30

Walter Mitty · Answer 2 · 2019-03-02T12:49:12.670

John Woo's answer is correct, as far as it goes. Here are a few additional points.

A primary key is always one of the candidate keys. Fairly often, it's the only candidate.

A table with no candidate keys does not represent a relation. If you're using the relational model to help you build a good database, then every table you design will have at least one candidate key.

The relational model would be complete without the concept of primary key. It wasn't in the original presentation of the relational model. As a practical matter, the use of foreign key references without a declared primary key leads to a mess. It could be a logically correct mess, but it's a mess nonetheless. Declaring a primary key lets the DBMS help you enforce the data rules. Most of the time, having the DBMS help you enforce the data rules is a good thing, and well worth the cost.

Some database designers and some users have some mental confusion about whether the primary key identifies a row (record) in a table or an instance of an entity in the subject matter that the table represents. In an ideal world, it's supposed to do both, and there should be a one-for-one correspondence between rows in an entity table and instances of the corresponding entity.

In the real world, things get screwed up. Somebody enters the same new employee twice, and the employee ends up with two ids. Somebody gets hired, but the data entry slips through the cracks in some manual process, and the employee doesn't get an id, until the omission is corrected. A database that does not collapse the first time things get screwed up is more robust than one that does.

I was totally with you until you *seemed* to suggest using a real world value as a primary key when you said "there should be a one-for-one correspondence between rows in an entity table and instances of the corresponding entity". I would strongly advise against that as there is always the possibility of an exception you never considered. For example a database designer might think "I will use a person's name as a primary key" then a person gets married or changes their name by deed poll. Or the designer uses car reg number as a PK and then somebody gets a personalised reg. etc etc — Caltor, May 25 '18 at 10:58
I was **not** suggesting the use of a natural key. That's a separate topic. What I am suggesting is that there should be a one for one correspondence between values of the primary key and instances of the entity. This can get screwed up by, duplicate data entry, for example. — Walter Mitty, May 25 '18 at 21:26

Habib · Answer 3 · 2012-10-11T04:24:15.497

12

Primary key -> Any column or set of columns that can uniquely identify a record in the table is a primary key. (There can be only one Primary key in the table)

Candidate key -> Any column or set of columns that are candidate to become primary key are Candidate key. (There can be one or more candidate key(s) in the table, if there is only one candidate key, it can be chosen as Primary key)

edited Oct 11 '12 at 04:24

answered Oct 10 '12 at 07:04

Habib

219,104
29
407
436

6

-1 All candidate keys too have the property that a value for their attribute(s) "can uniquely identify a record in the table". and in fact they **WILL** do exactly that. So THAT is NOT the distinction. – Erwin Smout Oct 10 '12 at 15:25
@erwinsmout "candidate to become primary key" means they can uniquely identify a record. Primary key is choosen among the candidate key(s). I am not really sure what you meant – Habib Oct 10 '12 at 17:09
Do you think there is a distinction between the two or do you think there is none ? – Erwin Smout Oct 10 '12 at 20:14
@ErwinSmout, I think I got your point. There can be only one primary key in the table, whereas there can be one or multiple candidate keys in the table. I have edited my answer to reflect that. Thanks for pointing it out. I left it out considering it is part of definition for the primary key, but yes it makes the answer more clear – Habib Oct 11 '12 at 04:26
Out of curiosity is it possible for a table to have no candidate keys, for example if there are duplicate records? – Celeritas Jan 24 '17 at 10:59
@Celeritas In a relation there is always a CK because the set of all columns is a superkey since they uniquely identify a tuple in a set. An SQL table can have duplicates. But it doesn't make sense to talk about relational CKs or PKs unless a table has no duplicates, because a table with duplicates isn't a relation. Also, an SQL PK/UNIQUE declaration actually declares a superkey, since it is unique but can contain a smaller set declared PK/UNIQUE. It is also a PK/CK if it doesn't. – philipxy Jul 20 '17 at 20:09

score 11 · Answer 4 · edited Jun 17 '19 at 11:38

A Primary key is a special kind of index in that:

there can be only one;
it cannot be nullable
it must be unique.

Candidate keys are selected from the set of super keys, the only thing we take care while selecting the candidate key is: It should not have any redundant attribute.

Example of an Employee table: Employee ( Employee ID, FullName, SSN, DeptID )

Candidate Key: are individual columns in a table that qualifies for the uniqueness of all the rows. Here in Employee table EmployeeID & SSN are Candidate keys.
Primary Key: are the columns you choose to maintain uniqueness in a table. Here in Employee table, you can choose either EmployeeID or SSN columns, EmployeeID is a preferable choice, as SSN is a secure value.
Alternate Key: Candidate column other the Primary column, like if EmployeeID is PK then SSN would be the Alternate key.
Super Key: If you add any other column/attribute to a Primary Key then it becomes a super key, like EmployeeID + FullName, is a Super Key.
Composite Key: If a table does not have a single column that qualifies for a Candidate key, then you have to select 2 or more columns to make a row unique. Like if there is no EmployeeID or SSN columns, then you can make FullName + DateOfBirth as Composite primary Key. But still, there can be a narrow chance of duplicate row.

score 10 · Answer 5 · answered Oct 10 '12 at 08:33

There is no difference. A primary key is a candidate key. By convention one candidate key in a relation is usually chosen to be the "primary" one but the choice is essentially arbitrary and a matter of convenience for database users/designers/developers. It doesn't make a "primary" key fundamentally any different to any other candidate key.

score 1 · Answer 6 · edited Jan 10 '14 at 04:54

1

A table can have so many column which can uniquely identify a row. This columns are referred as candidate keys, but primary key should be one of them because one primary key is enough for a table. So selection of primary key is important among so many candidate key. Thats the main difference.

edited Jan 10 '14 at 04:54

Devraj Gadhavi

3,541
3
38
67

answered Jan 10 '14 at 04:32

Mudassir

19
2

score 0 · Answer 7 · answered Oct 10 '12 at 08:01

0

Think of a table of vehicles with an integer Primary Key.

The registration number would be a candidate key.

In the real world registration numbers are subject change so it depends somewhat on the circumstances what might qualify as a candidate key.

answered Oct 10 '12 at 08:01

Hugh Jones

2,706
19
30

score 0 · Answer 8 · answered Apr 14 '19 at 13:16

0

Primary key -> Any column or set of columns that can uniquely identify a record in the table is a primary key. (There can be only one Primary key in the table) and the candidate key-> the same as Primary key but the Primary Key chosen by DB administrator's prospective for example(the primary key the least candidate key in size)

answered Apr 14 '19 at 13:16

Ahmed Nasr Elhariri

11

1

What do you think this adds to the many highly upvoted & old answers to make this "useful" per the voting arrow mouseover texts? Also, your CK sentence is not clear. Also, your PK sentence is not clear. PS In SQL: A CK is a set of unique columns with no nulls. A PK is one of the CKs that got picked to be called PK. In relational model theory: There are no nulls. A superkey is a unique set of columns. A CK is a superkey that contains no smaller superkey. A PK is one of the CKs that got picked to be called PK. So there are already lots of answer posts here with poor content & poor writing. – philipxy Apr 15 '19 at 04:33
I believe @Ahmed Nasr Elhariri has a good point - "the same as Primary key but the Primary Key chosen by DB administrator's ".... check nvogel's answer down to tie them together. – Fact Mar 05 '20 at 22:41
@Fact All this adds to nvogel's answer is something wrong: "the primary key the least candidate key in size". – philipxy Apr 29 '20 at 08:33

score -1 · Answer 9 · answered Feb 23 '14 at 08:57

A primary key is a column (or columns) in a table that uniquely identifies the rows in that table.

CUSTOMERS


CustomerNo  FirstName   LastName
1   Sally   Thompson
2   Sally   Henderson
3   Harry   Henderson
4   Sandra  Wellington

For example, in the table above, CustomerNo is the primary key.

The values placed in primary key columns must be unique for each row: no duplicates can be tolerated. In addition, nulls are not allowed in primary key columns.

So, having told you that it is possible to use one or more columns as a primary key, how do you decide which columns (and how many) to choose?

Well there are times when it is advisable or essential to use multiple columns. However, if you cannot see an immediate reason to use multiple columns, then use one. This isn't an absolute rule, it is simply advice. However, primary keys made up of single columns are generally easier to maintain and faster in operation. This means that if you query the database, you will usually get the answer back faster if the tables have single column primary keys.

Next question — which column should you pick? The easiest way to choose a column as a primary key (and a method that is reasonably commonly employed) is to get the database itself to automatically allocate a unique number to each row.

In a table of employees, clearly any column like FirstName is a poor choice since you cannot control employee's first names. Often there is only one choice for the primary key, as in the case above. However, if there is more than one, these can be described as 'candidate keys' — the name reflects that they are candidates for the responsible job of primary key.

If you're "maintaining" primary keys, you're in for a world of hurt. Your example fails to explain why the multi-column key of `FirstName`, `LastName` is problematic. Also, the reason for `CustomerNo` to even be _in_ the database is that there was (presumably) no natural candidate key on a "customer" entity - that is, you had no candidate keys **at all** until it was added; if someone is designing a table, they don't start out having a surrogate key, those are added as necessary. — Clockwork-Muse, Feb 23 '14 at 09:31

score -1 · Answer 10 · answered Nov 30 '15 at 19:55

-1

If superkey is a big set than candidate key is some smaller set inside big set and primary key any one element(one at a time or for a table) in candidate key set.

answered Nov 30 '15 at 19:55

Polish

554
1
4
18

score -1 · Answer 11 · answered Feb 08 '17 at 11:18

First you have to know what is a determinant? the determinant is an attribute that used to determine another attribute in the same table. SO the determinant must be a candidate key. And you can have more than one determinant. But primary key is used to determine the whole record and you can have only one primary key. Both primary and candidate key can consist of one or more attributes

What is the difference between a candidate key and a primary key?

11 Answers11

Linked