Why use multiple columns as primary keys (composite primary key)

Question

This example is taken from w3schools.

CREATE TABLE Persons
(
    P_Id int NOT NULL,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255),
    CONSTRAINT pk_PersonID PRIMARY KEY (P_Id,LastName)
)

My understanding is that both columns together (P_Id and LastName) represent a primary key for the table Persons. Is this correct?

Why would someone want to use multiple columns as primary keys instead of a single column?
How many columns can be used together as a primary key in a given table?

...now there is also an [answer for the 2'nd question](http://stackoverflow.com/a/41741054/2932052) — Wolf, Jan 19 '17 at 11:58

score 132 · Accepted Answer · edited Aug 29 '17 at 10:15

132

Your understanding is correct.

You would do this in many cases. One example is in a relationship like OrderHeader and OrderDetail. The PK in OrderHeader might be OrderNumber. The PK in OrderDetail might be OrderNumber AND LineNumber. If it was either of those two, it would not be unique, but the combination of the two is guaranteed unique.

The alternative is to use a generated (non-intelligent) primary key, for example in this case OrderDetailId. But then you would not always see the relationship as easily. Some folks prefer one way; some prefer the other way.

edited Aug 29 '17 at 10:15

Vadim Ovchinnikov

13,327
5
62
90

answered Apr 12 '10 at 23:53

MJB

7,639
2
31
41

2

Is this useful if i am using branch_id and using replication between two databases, will solve duplicate of ids ?!! – Mhmd May 24 '14 at 06:27
14

Note that in many cases of using a generated primary key, you often still want a unique key on the composite values. – Bacon Bits Sep 11 '16 at 12:41
Please elaborate on "Some folks prefer one way; some prefer the other way" . – Username May 10 '20 at 16:45
2

Pleas elaborate? Not sure what to say. I have known people that prefer to have multiple concatenated fields as a key because it is easier intuitively to understand what they are looking at. I have known others that prefer just assigning a unique key to each row because it is easier and quicker to type. Is that what you are asking? – MJB May 11 '20 at 14:02
That message was meant for @Username. I forgot to direct it. – MJB May 11 '20 at 17:30

score 28 · Answer 2 · edited Sep 11 '16 at 12:43

Another example of compound primary keys are the usage of Association tables. Suppose you have a person table that contains a set of people and a group table that contains a set of groups. Now you want to create a many to many relationship on person and group. Meaning each person can belong to many groups. Here is what the table structure would look like using a compound primary key.

Create Table Person(
PersonID int Not Null,
FirstName varchar(50),
LastName varchar(50),
Constraint PK_Person PRIMARY KEY (PersonID))

Create Table Group (
GroupId int Not Null,
GroupName varchar(50),
Constraint PK_Group PRIMARY KEY (GroupId))

Create Table GroupMember (
GroupId int Not Null,
PersonId int Not Null,
CONSTRAINT FK_GroupMember_Group FOREIGN KEY (GroupId) References Group(GroupId),
CONSTRAINT FK_GroupMember_Person FOREIGN KEY (PersonId) References Person(PersonId),
CONSTRAINT PK_GroupMember PRIMARY KEY (GroupId, PersonID))

great explanation: I think the need of attributes for m-to-n relations (in a normalized fasion) is the key. — Wolf, Jan 19 '17 at 11:50

score 9 · Answer 3 · edited May 23 '17 at 12:34

9

The W3Schools example isn't saying when you should use compound primary keys, and is only giving example syntax using the same example table as for other keys.

Their choice of example is perhaps misleading you by combining a meaningless key (P_Id) and a natural key (LastName). This odd choice of primary key says that the following rows are valid according to the schema and are necessary to uniquely identify a student. Intuitively this doesn't make sense.

1234     Jobs
1234     Gates

Further Reading: The great primary-key debate or just Google meaningless primary keys or even peruse this SO question

FWIW - My 2 cents is to avoid multi-column primary keys and use a single generated id field (surrogate key) as the primary key and add additional (unique) constraints where necessary.

edited May 23 '17 at 12:34

Community

1
1

answered Apr 14 '10 at 22:33

Robert Paulson

17,603
5
34
53

1

1) the "great primary key debate" link is particularly stupid, the info is self-serving and false. 2) The index on the columns that make the row unique cannot be avoided. A "surrogate" ID with an index is always an additional column and an additional index. Rather silly because it is redundant. And slower. – PerformanceDBA Feb 17 '19 at 12:57
2

The "great primary key debate" is not stupid. It's a very valid issue from developers that are not sql developers or sql DBA's and don't spend all their time in sql. Even in pure sql I'd rather have a meaningless auto-generated key as the primary key when joining than have to remember to pass n bits of data around being the natural key. You're welcome to your viewpoint, but we'd appreciate not being so dismissive. – Robert Paulson Feb 20 '19 at 21:04

nvogel · Answer 4 · 2010-11-23T14:28:37.587

4

You use a compound key (a key with more than one attribute) whenever you want to ensure the uniqueness of a combination of several attributes. A single attribute key would not achieve the same thing.

edited Nov 23 '10 at 14:28

answered Nov 23 '10 at 14:09

nvogel

24,981
1
44
82

1

As for ensuring a unique key, you might rely on the combination of two attributes to form a key that logically cannot be duplicated, Person and graduation date from a larger dataset would be an example. – John Mark Nov 08 '15 at 06:13

score 2 · Answer 5 · answered Apr 12 '10 at 23:54

Yes, they both form the primary key. Especially in tables where you don't have a surrogate key, it may be necessary to specify multiple attributes as the unique identifier for each record (bad example: a table with both a first name and last name might require the combination of them to be unique).

score 2 · Answer 6 · edited Feb 13 '19 at 12:58

Your second question

How many columns can be used together as a primary key in a given table?

is implementation specific: it's defined in the actual DBMS being used.^[1],[2],[3] You have to inspect the technical specification of the database system you use. Some are very detailed, some are not. Searching the web about such limitations can be hard because the terminology varies. The term composite primary key should be mandatory ;)

If you cannot find explicit information, try creating a test database to ensure you can expect stable (and specific) handling of the limit violations (which are to be expected). Be careful to get the right information about this: sometimes the limits are accumulated, and you'll see different results with different database layouts.

score 2 · Answer 7 · answered Nov 22 '10 at 20:16

2

Multiple columns in a key are going to, in general, perform more poorly than a surrogate key. I prefer to have a surrogate key and then a unique index on a multicolumn key. That way you can have better performance and the uniqueness needed is maintained. And even better, when one of the values in that key changes, you don't also have to update a million child entries in 215 child tables.

answered Nov 22 '10 at 20:16

HLGEM

94,695
15
113
186

1

1) Performance. Not in an SQL platform (maybe in the pretend "sql"s and freeware). 2) Preference is irrelevant. What the tables require, for integrity, is relevant. 3) A "surrogate" ID with an index is always an **additional** column and an **additional** index. So that would be slower, on any platform. Re performance, you contradict yourself. 4) If you don't know how to update the mythical "million child entries in 215 child tables" *properly*, ask a question. – PerformanceDBA Feb 17 '19 at 11:45
2

I disagree with the statement 'Multiple columns in a key are going to, in general, perform more poorly than a surrogate key'. Often an extra query is required to get the surrogate key of a relationship when you consider it. At which point it's a full extra round trip slower performance wise. – ttugates Apr 19 '19 at 20:50

score 1 · Answer 8 · answered Sep 11 '16 at 12:02

Using a primary key on multiple tables comes in handy when you're using an intermediate table in a relational database.

I'll use a database I once made for an example and specifically three tables within that table. I creäted a database for a webcomic some years ago. One table was called "comics"—a listing of all comics, their titles, image file name, etc. The primary key was "comicnum".

The second table was "characters"—their names and a brief description. The primary key was on "charname".

Since each comic—with some exceptions—had multiple characters and each character appeared within multiple comics, it was impractical to put a column in either "characters" or "comics" to reflect that. Instead, I creäted a third table was called "comicchars", and that was a listing of which characters appeared in which comics. Since this table essentially joined the two tables, it needed but two columns: charname and comicnum, and the primary key was on both.

score 0 · Answer 9 · answered Aug 08 '19 at 20:14

We create composite primary keys to guarantee the uniqueness column values which compose a single record. It is a constraint which helps prevent insertion of data which should not be duplicated.

i.e: If all student Ids and birth certificate numbers are uniquely assigned to a single person. Then it would be a good idea to make the primary key for a person a composition of student id and birth certificate number because it would prevent you from accidentally inserting two people who have different student ids and the same birth certificate.

Why use multiple columns as primary keys (composite primary key)

9 Answers9

Linked