1

What is the industry design standard for representing the relationship between two tables? Should you use a composite key, or a primary key and foreign key separated? Are there performance differences between the two options?

For example, say you have a user and that user can have zero or many orders. You have the user table with attributes PK UserID and Name. An order will NEVER exist without a User. Now, this is where the meat of my question comes in...

  1. Do you follow the first diagram, where a composite key is used between the OrderID and UserID?

  2. Or, do you follow the second diagram, where there is NO composite key and the Order is simply identified by an OrderID with a FK UserID to the User table?

ER Diagram

Corey P
  • 955
  • 1
  • 14
  • 23
  • A one to zero or many relationship, using a composite key. In this example, it may look like "PRIMARY KEY (OrderID, CustomerID)" in SQL. Where CustomerID is also a foreign key to UserID – Corey P Jan 22 '18 at 16:19
  • 1
    diagram 2 is better IMO. Suppose that an order is needed as foreign key in another table. In diagram 1 that other table also needs the column CustomerID together with OrderID. And there is no need for that when OrderID is unique – GuidoG Jan 22 '18 at 16:21
  • Sorry, I updated the diagram to make it more clear... So, the fact that an order can never exist without a User doesn't justify the use of a composite key? – Corey P Jan 22 '18 at 16:24
  • indeed it does not. Because when you do that, than all tables that need to link to table Order must also have the column UserID. And that is really not needed is it – GuidoG Jan 22 '18 at 16:26
  • Mind you that it will work in both diagrams, the second just makes life easier for you – GuidoG Jan 22 '18 at 16:28
  • Is a `CustomerID` somehow linked to a `User` in the assumed `Customer` table? If it makes more sense to link the user to an order through the customer > user relationship, then do it that way. I'd generally default to using the primary key / foreign key relationship for simplicity and readability. If there's some design or performance reason that would make a composite key more viable, then you should do that. Here's an article that touches on this topic: http://www.ben-morris.com/identity-surrogate-vs-composite-keys-in-sql-server/ – EMUEVIL Jan 22 '18 at 16:33
  • If you use a composite key in table `Order` than you have to have all those columns in any table that relates to table `Order`and thus you create a redundancy. There is no need to store `UserID`so many times over again when you already have it in table `Order` So actually you are breaking the first rule of normalization – GuidoG Jan 22 '18 at 16:36
  • See: https://stackoverflow.com/questions/762937/whats-the-difference-between-identifying-and-non-identifying-relationships – David Browne - Microsoft Jan 22 '18 at 16:43
  • 2
    There is a genuine functional difference between these two cases, not just a performance/stylistic difference. In the structure 2, OrderID is unique across the table, so once used for one customer, that order ID cannot be used for any other customer. In structure 1, it is the combination of OrderID and CustomerID that is unique, in which case you could have the same OrderID used for two different customers. – JohnRC Jan 22 '18 at 16:55
  • @JohnRC That is an excellent point that I overlooked. What's an example of two cases that could demonstrate my question? My question remains the same. – Corey P Jan 22 '18 at 17:26
  • IMHO your question mixes _logical_ and _physical_ design points. In logical design you consider only the unique object identifiers and relationship constraints imposed by the requirements of the business application. So you define _only_ unique entity identifier keys and mandatory entity relationships. In physical design the _performance_ of the system comes into play and choice of indexing depends on the dynamics of how the data is updated and queried, so you would choose a primary key, add unique or non-unique composite indexes, and maybe de-normalise just for performance reasons. – JohnRC Jan 22 '18 at 18:56
  • Another point that I think is worth considering. I reckon it is worth having an automatically populated ID column in all tables to uniquely identify every row in a table. Where a FK relationship exists, then use the unique ID of the master as the FK, because that makes the link permanent and independent of changes to symbolic keys. If you need to report the symbolic keys when reporting from the child table, join with the master table to pick them up. Make the ID column an integer type or GUID type depending on scope requirements. – JohnRC Jan 25 '18 at 15:33

1 Answers1

1

The general rule is "keep your keys as narrow as possible, unless you have an explicit requirement to expand them".

Typical situations

Following your example, when you will start designing the OrderDetails table, with the design 1 your options are:

  1. Migrate the whole PK, OrderId, UserId into the child table, or
  2. Introduce some single column alternate key into the Orders table, and reference this AK, instead of PK.

If you do need the user in the order details (which is extremely unlikely), the former schema is fine, but you probably don't. In addition, if you will need to implement a functionality like "change ownership of an order", with the first approach you will discover that the task suddenly becomes much more convoluted than it normally should be.

As such, in the absolute majority of cases, the latter design is a way to go.

Extreme cases

Suppose you have some peculiar business requirement, say:

For every user, orders' numbers should form a sequence independent from orders of any other users.

Looks like a reasonable justification for design 1, isn't it? Well, yes and no. Sure, you have to create a composite key comprised of UserId, OrderId so that order numbers can be reused by different users, however:

  1. Such a key doesn't have to be a primary key. Also, you don't have to reference it by any of the foreign keys from child tables. Just create a surrogate Id bigint identity(1,1) primary key and reference it from anywhere.
  2. The rule "don't update keys, or at least don't reference updateable keys" still stands. If the key is potentially updateable, it's a very poor candidate for being referenced by a foreign key.
Roger Wolf
  • 7,307
  • 2
  • 24
  • 33