31

Why is the rejection of composite keys in favor of all tables using a single primary key named id? Cause generally all ORM follow this.

EDIT

I just started learning ruby on rails and in the book of agile development by pragmatic there is a line:--- Rails really don't work too well unless each table has a numeric primary key. It is less fussy about the name of the column. Same kind of line I read when I was learning Doctrine.

EDIT2 Please check this link too. I am getting more and more confused about this thing:--- Composite primary keys versus unique object ID field

From the above link:--

*the primary key should be constant and meaningless; non-surrogate keys usually fail one or both requirements, eventually

If the key is not constant, you have a future update issue that can get quite complicated if the key is not meaningless, then it is more likely to change, i.e. not be constant; see above

Take a simple, common example: a table of Inventory items. It may be tempting to make the item number (SKU number, barcode, part code, or whatever) the primary key, but then a year later all the item numbers change and you're left with a very messy update-the-whole-database problem...

EDIT: there's an additional issue that is more practical than philosophical. In many cases you're going to find a particular row somehow, then later update it or find it again (or both). With composite keys, there is more data to keep track of and more constraints in the WHERE clause for the re-find or update (or delete). It is also possible that one of the key segments may have changed in the meantime!. With a surrogate key, there is always only one value to retain (the surrogate ID) and by definition, it cannot change, which simplifies the situation significantly.*

Mohit Jain
  • 43,139
  • 57
  • 169
  • 274
  • 4
    Can you please cite your sources? I use both: composite keys and sequence-based ID primary keys. Sometimes one is more appropriate than the other. – FrustratedWithFormsDesigner Apr 19 '10 at 14:08
  • What ORM are you using which advocates always using simple keys over composite keys? I've never encountered an ORM that didn't play nicely with composite keys. To my knowledge there is no good reason to give e.g. a junction table a surrogate key of its own. – Iain Galloway Apr 19 '10 at 15:42
  • @FrustratedWithFormsDesigner @Iain Galloway I read this while learning Doctrine and now i encountered the same thing while learning Ruby on rails. Let me know if u want the exact page number of book where this line is written.. I m beginner So i m asking for the real thing. Not cross questiong or debating on this. Means why they are using this ID field . ?? – Mohit Jain Apr 20 '10 at 13:28
  • In that case, I think the statement "primary keys are better than composite keys" *might* be true in the context of Ruby-on-Rails and Doctrine. This is probably because of the underlying architectures (but I'm not an expert on Rails or Doctrine so I can't safely comment further;) ), but the statement should *not* be assumed to automatically apply to other environments/ORMS. – FrustratedWithFormsDesigner Apr 20 '10 at 13:51
  • @FrustratedWithFormsDesigner take a look on the edit portion.. Check that link. – Mohit Jain Apr 21 '10 at 05:40

9 Answers9

39

I don't think there is a blanket statement that you should only ever use a single primary key named id.

Most people use a surrogate primary key as an auto generate int, because it isolates the primary key from ever needing to be changed, like if you make the PK the user name and they later changed their legal name. You would have to update the PK and all FK columns to reflect the new name. if you had used a surrogate primary key, you just update the user's name in one spot (because the tables join on the int not the name).

The size of a primary key is important because the PK is duplicated into every index you build on the table. If the PK is large (like a string) you have fewer keys per page in the index and the index will take more cache memory to store it. Ints are small.

Having a auto increment int PK lends itself to being a clustered index well, as rows are stored in this order and there is no need to go back and bump rows out of the way to insert a new row, you always add to the table's end.

almog.ori
  • 7,839
  • 1
  • 35
  • 49
KM.
  • 101,727
  • 34
  • 178
  • 212
  • Sequential guids work really well too, especially when you need to merge data sets from multiple databases into a single table. – Juliet Apr 19 '10 at 14:17
  • @Juliet, but they are large and waste index space. If possible it is best to combine a auto increment int ID with another column with identifies the database/company/etc (like a 1 or two byte int) where that row came from because it can use much less space than the guid, yet still be unique across all databases. – KM. Apr 19 '10 at 14:20
  • 4
    I wouldnt recommend the use of guids except for edge cases – James Westgate Apr 19 '10 at 14:32
  • @KM I just started learning ruby on rails and in the book of agile develepment by pragmatic there is a line:--- Rails really doesn’t work too well unless each table has a numeric primary key. It is less fussy about the name of the column. Same kind of line i read when i was learning Doctrine. – Mohit Jain Apr 20 '10 at 13:16
18

The only real limitation that I have run into using composite keys regards using an IN expression with a subquery. This is a problem, because a subquery in an IN expression must return a single column (at least in T-SQL).

SELECT
    emp.Name,
    emp.UserDomain,
    emp.UserID
FROM
    employee emp
WHERE
    ???? IN (SELECT e.UserDomain, e.UserID FROM ... /* some complex 
                                                       non-correlated subquery 
                                                       or CTE */
            )

There are always work-arounds, of course, but sometimes it could be an annoyance.

This is hardly a reason to avoid a composite key in places where it makes sense to use one.

Jeffrey L Whitledge
  • 58,241
  • 9
  • 71
  • 99
10

You can use both. In some cases when making an association between an entity you can use both entity keys as a composite key.

As a rule of thumb I use generated ids for entities and composite keys for relationships.

James Westgate
  • 11,306
  • 8
  • 61
  • 68
8

Well, it's basically about keeping JOINs simple - which one is simpler to understand:

SELECT
   p.ID, p.Name, p.City,
   c.ID, c.Country, c.ISOCode
FROM
   dbo.Parent p
INNER JOIN
   dbo.Child c on c.ParentID = p.ID

or

SELECT
   p.ID, p.Name, p.City,
   c.ID, c.Country, c.ISOCode
FROM
   dbo.Parent p
INNER JOIN
   dbo.Child c ON c.ParentName = p.Name
     AND c.ParentCity = p.City
     AND c.ParentCountry = p.Country

If you have composite primary keys, anyone joining to your table from a child table must "drag along" all those columns, and all those columns are also going to be present in the child table and the JOIN statements are pretty messy. Much better to have a single (even surrogate) key for the JOIN!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • Would you recommend using a surrogate key (rather than a composite key) as the PK for a junction table representing a many-many relationship? If so, why? – Iain Galloway Apr 20 '10 at 11:50
  • 1
    @Iain Galloway: not necessarily - junction tables for many-to-many relationships are one of the exceptional cases where you'd probably use a composite PK to your advantage. But there's no harm (except for extra columns) in adding a surrogate PK even in this case - it would allow you to more easily manage (e.g. delete) entries from that junction table – marc_s Apr 20 '10 at 12:17
  • Also if your junction table ever grows to become something more than just a junction, taking on attributes and properties of it's own, having a single surrogate primary key can help. This is especially true if you hang additional tables off of the junction table... if you kept a composite key for the PK, then all of those OK columns would have to be replicated down to those additional tables. With a single PK, child tables simple add a single FK column to the single PK column of the junction table. A single PK column can help support schema changes and DB design growth better over time. – ryancdotnet Apr 27 '21 at 02:15
5

I worked on an app with a 11 column primary key. It was always great fun retyping the list over and over and over every time I wanted to guarantee I was updating one row. It was a driver of bugs, MS-Access couldn't cope with more than 10 columns in a PK, etc.

Large composite keys are design smells that mean the table holds heterogenous entities or the designer wasn't really sure what it is that is unique about each entity. (Like assuming that hair color, eye color and body weight should be enough to unique identify an employee-- which isn't a good key because you'd need more and more and more columns to make it work and eventually that will include fields that are volatile and change a lot, like weight, or for some people hair color or lack there of.)

MatthewMartin
  • 32,326
  • 33
  • 105
  • 164
4

Although I agree with most of the reasons given by other respondents, my primary reason for preferring a single-column integer key is that it makes writing a user interface much, much easier.

If you are using some kind of list control to represent your data (a list, list view, combo box, etc) you can uniquely relate each entry back to its database representation through a single integer value stored with the item. Most pre-written components already allow you to attach an integer to each item and for those that don't, it's very easy to extend the component to do so.

If you're passing data between a server application and a web page, it's much easier to store the single identifying value in the the id attribute of the widget that represents the data than to have to compose and parse multi-value ids.

Larry Lustig
  • 49,320
  • 14
  • 110
  • 160
1
  1. For an ORM a single identifying column with a consistent name like table_id is easier than a composite key. But every good ORM support composite keys.

  2. An simple PK can easily be 'autoincremented' by the database. This doesn't hold for a composite key.

  3. A simple PK is also easier to use in queries. When you need to join, you only have to use one column of both relations.

This is not to say that simple PKs are better than composite ones, though.

Exception e
  • 1,864
  • 3
  • 19
  • 33
0

Objects in OO programming have identity regardless of their contents. Rows (tuples) in relational databases are identified by their contents only. So when really doing ORM, i.e. mapping objects from an object-oriented programming language to a relational database, an extra ID must be provided as distinct from the fields and/or properties the object has in the program - unless one or more of those are somehow known to identify objects uniquely.

reinierpost
  • 8,425
  • 1
  • 38
  • 70
  • You certainly require a *primary key* on your table, but there is no reason this has to be a single column. Composite primary keys are very very common, especially on junction tables. – Iain Galloway Apr 19 '10 at 15:36
  • Please read again. I didn't say primary keys are required in relational database schemas (they aren't), nor that composite keys do not arise in designing relational database schemas (they do). I just wanted to say that if your relational database model is blindly *generated* from an OO class model, all tables will need to have surrogate IDs as their primary keys that do not correspond to any of the properties of the object classes. Junction tables will not be generated unless you treat collection-valued properties in a special way. – reinierpost Apr 20 '10 at 15:18
  • Do you have a special way of representing a m:m relationship without using junction tables? – Iain Galloway Apr 22 '10 at 07:38
  • Can you give an example of an m:n relationship in a class model? – reinierpost Apr 22 '10 at 11:32
  • High-level example: interface IGroupable where T:IGroup { ICollection Groups { get; } } interface IGroup where T:IGroupable { ICollection Items { get; } } – Iain Galloway Apr 22 '10 at 13:23
  • OK, thanks. So if you treat collection-valued properties in a special way, you will generate junction tables for them. Now please read again: "Junction tables will not be generated unless you treat collection-valued properties in a special way." Something serious seems to be blocking our communication here. – reinierpost Apr 23 '10 at 10:21
  • Sure, I'm missing you somewhere. You say I'm treating collections "specially", but I don't understand what your alternative is. How would you treat "collection-valued properties" such as those I suggested in such a way as to not require a junction table? – Iain Galloway Apr 23 '10 at 12:10
  • If you don't treat them specially, and translate them in a generic way, you won't end up with junction tables. The generic translation will translate each class as a table with an ID column plus one column for each field, so the exact columns will depend on the collection's internal implementation, e.g. a linked list will have Current and Next columns, an array Index and Value. The ID is used as the property's value. – reinierpost May 06 '10 at 12:34
0

Your question is strongly related to the surrogate (or artificial) keys vs natural keys alternative. I think it's not that composite keys are less used, but that natural keys (be them composite or simple) are less favoured than artificial keys.

Traditional relational database theory dealt mostly with "natural" keys, (the ones which have meaning from the business-domain point of view) and in that scenario composite keys are frequently found... naturally.

But in the later years, database design has favoured (though not exclusively) the "artificial" (surrogate) key pattern, typically a sequential number that has no business meaning, only serves to uniquely identifies the record in the table (and perhaps the object in the upper layer).

Community
  • 1
  • 1
leonbloy
  • 73,180
  • 20
  • 142
  • 190
  • 1
    Composite keys are still found "naturally" when using surrogate keys. Surrogate vs Natural keys is a completely separate debate! – Iain Galloway Apr 19 '10 at 15:38
  • Read the question: "Why is the rejection of composite keys in favor of all tables using a single primary key named id." The 'single primary key named id' is precisely a surrogate key. – leonbloy Apr 19 '10 at 15:57
  • 1
    I read the question carefully. He doesn't mention natural keys at all. The debates (natural vs surrogate and simple vs composite) are completely separate. Composite keys frequently occur in the absence of natural keys. For a relevant example see e.g. http://megocode3.wordpress.com/2008/01/04/understanding-a-sql-junction-table/ Can you think of a good reason to give that junction table a surrogate key of its own? – Iain Galloway Apr 19 '10 at 16:24
  • First, the surrogate vs natural "debate" IS certainly relevant for the question (take a look at the accepted answer) and is mandatory reading for the poster. Second, a joint table like that of your example certainly has a "natural" (in both senses of the word) PK, it does not need a surrogate key at all; however, the fact is that nowadays many people have the (good or bad) habit of "universal surrogate PK design", using an "id" PK (surrogate) for ALL tables (as the question notes); so that, even in your example, the "natural" key would not be used as a PK - just a UNIQUE restriction. – leonbloy Apr 19 '10 at 19:24
  • Certainly I have seen a movement towards using an autonumber PK for all tables, including junction tables. The OP has seen this too and is asking *why*. – Iain Galloway Apr 20 '10 at 11:45