Questions tagged [surrogate-key]

A key in a database table that has no intrinsic logical meaning and was introduced for better physical organization of the database or for other technical reasons.

Terminology

  • A "superkey" is any set of attributes that, when taken together, uniquely identify rows in the table.
  • A minimal¹ superkey is called a "candidate key", or just a "key".
  • A "natural key" is a key composed of attributes that have intrinsic meaning and would be present in the data model even if no key were defined on them.
  • A "surrogate key", on the other hand, has no intrinsic meaning and is introduced purely for technical reasons, as discussed below.
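
The definitions above can be checked mechanically against a sample of rows. The following is a minimal sketch, not a general tool: the rows and column names (`employee_id`, `email`, `department`) are hypothetical, and uniqueness in one sample only suggests a key, since a real key is a constraint on every possible state of the table.

```python
from itertools import combinations

# Hypothetical sample rows; the column names are illustrative assumptions.
rows = [
    {"employee_id": 1, "email": "ann@example.com", "department": "Sales"},
    {"employee_id": 2, "email": "bob@example.com", "department": "Sales"},
    {"employee_id": 3, "email": "cat@example.com", "department": "Support"},
]

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no non-empty proper subset is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )
```

In this sample, `("employee_id", "department")` is a superkey but not a candidate key, because `("employee_id",)` alone is already unique.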

Properties of Surrogate Keys

If there is a natural key in the model, it cannot be replaced with a surrogate key. A surrogate key can only be created in addition to any existing natural keys.

So the engineering decision is not between a surrogate key and a natural key, but between a surrogate key plus a natural key and the natural key alone.
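
As a rough illustration of "surrogate + natural key", the sketch below uses SQLite through Python's `sqlite3` module as a stand-in DBMS. The `country` table and its columns are hypothetical. The point is that the natural key keeps its own UNIQUE constraint even after the surrogate is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE country (
        country_id INTEGER PRIMARY KEY,   -- surrogate key (assumed name)
        iso_code   TEXT NOT NULL UNIQUE,  -- natural key, still enforced
        name       TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO country (iso_code, name) VALUES ('DE', 'Germany')")

# The surrogate would happily accept a second 'DE' row with a new id;
# only the UNIQUE constraint on the natural key prevents the duplicate.
try:
    conn.execute("INSERT INTO country (iso_code, name) VALUES ('DE', 'Deutschland')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Dropping the UNIQUE constraint here would let logically duplicate rows in, which is why the surrogate cannot simply replace the natural key.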

Having a surrogate key + natural key:

  • May make the FOREIGN KEYs in child tables slimmer.²
  • Means the surrogate never needs to change, and therefore never incurs the ON UPDATE CASCADE referential action.
  • May be more friendly to object-relational mapping (ORM) tools.
  • May be more friendly to object-relational mapping (ORM) tools.
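
The ON UPDATE CASCADE point can be sketched side by side, again with SQLite as a stand-in and with hypothetical table names (`account_n`/`post_n` for the natural-key design, `account_s`/`post_s` for the surrogate design). Renaming a user rewrites the child rows in the first design and leaves them alone in the second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    -- Natural key only: children store the username itself.
    CREATE TABLE account_n (username TEXT PRIMARY KEY);
    CREATE TABLE post_n (
        post_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL
                 REFERENCES account_n (username) ON UPDATE CASCADE
    );

    -- Surrogate + natural key: children store the integer id.
    CREATE TABLE account_s (
        account_id INTEGER PRIMARY KEY,
        username   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE post_s (
        post_id    INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL REFERENCES account_s (account_id)
    );

    INSERT INTO account_n VALUES ('alice');
    INSERT INTO post_n (username) VALUES ('alice');
    INSERT INTO account_s (username) VALUES ('alice');
    INSERT INTO post_s (account_id) VALUES (1);
""")

# Rename the user in both designs.
conn.execute("UPDATE account_n SET username = 'alice2' WHERE username = 'alice'")
conn.execute("UPDATE account_s SET username = 'alice2' WHERE username = 'alice'")

# Natural-key child was rewritten by the cascade; surrogate child was not touched.
child_n = conn.execute("SELECT username FROM post_n").fetchone()[0]
child_s = conn.execute("SELECT account_id FROM post_s").fetchone()[0]
```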

Having only a natural key:

  • Makes the parent table slimmer.³
  • May play better with clustering.⁴
  • May make JOINs unnecessary in some situations.⁵
  • May be needed for correctly modeling certain kinds of diamond-shaped dependencies. For example, the following model guarantees that if B and C are connected to the same D, they must also be connected to the same A:

    [Figure: diamond-shaped dependency, with A at the top, B and C on the sides, and D at the bottom]

    Note how A_ID gets propagated from the "diamond" top, down both sides and then is merged at the bottom.
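
The diamond can be approximated in DDL as follows. This is a sketch under assumptions: only A_ID comes from the description above, while the columns B_NO and C_NO and the exact key layout are hypothetical, and SQLite stands in for the DBMS. Because D has a single A_ID column that takes part in both composite foreign keys, a D row cannot pair a B and a C that belong to different A rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE A (A_ID INTEGER PRIMARY KEY);
    CREATE TABLE B (
        A_ID INTEGER NOT NULL REFERENCES A (A_ID),
        B_NO INTEGER NOT NULL,
        PRIMARY KEY (A_ID, B_NO)
    );
    CREATE TABLE C (
        A_ID INTEGER NOT NULL REFERENCES A (A_ID),
        C_NO INTEGER NOT NULL,
        PRIMARY KEY (A_ID, C_NO)
    );
    -- A_ID arrives from both sides and is merged into one column.
    CREATE TABLE D (
        A_ID INTEGER NOT NULL,
        B_NO INTEGER NOT NULL,
        C_NO INTEGER NOT NULL,
        PRIMARY KEY (A_ID, B_NO, C_NO),
        FOREIGN KEY (A_ID, B_NO) REFERENCES B (A_ID, B_NO),
        FOREIGN KEY (A_ID, C_NO) REFERENCES C (A_ID, C_NO)
    );
    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (1, 10);
    INSERT INTO C VALUES (1, 20), (2, 30);
""")

conn.execute("INSERT INTO D VALUES (1, 10, 20)")  # B and C share A_ID = 1
try:
    conn.execute("INSERT INTO D VALUES (1, 10, 30)")  # C_NO 30 exists only under A_ID = 2
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True
```

If B and C each had their own surrogate key instead, D would reference two unrelated integers and the database could not enforce this rule declaratively.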

Typical Implementations of Surrogate Keys

Most commonly, a surrogate key is implemented as an auto-incremented integer. Examples:

  • Oracle supports the SEQUENCE object, which can be used either directly in an INSERT statement or through a BEFORE INSERT trigger.
  • MS SQL Server has the IDENTITY column property and, from the 2012 version, the explicit SEQUENCE object as well.
  • PostgreSQL supports the explicit SEQUENCE object, as well as the serial types that use sequences implicitly.
  • MySQL has the AUTO_INCREMENT attribute.
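
For a runnable illustration of the auto-increment approach, the sketch below uses SQLite, which is not one of the systems listed above and serves only as a convenient stand-in. The `customer` table is hypothetical. The database assigns the surrogate value, and the application reads it back through the cursor's `lastrowid`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        email       TEXT NOT NULL UNIQUE                -- natural key
    )
""")

# No customer_id is supplied; the DBMS generates it on each INSERT.
ids = [
    conn.execute("INSERT INTO customer (email) VALUES (?)", (email,)).lastrowid
    for email in ("ann@example.com", "bob@example.com")
]
```

The other systems expose the same idea through their own syntax (sequences, IDENTITY, serial types, AUTO_INCREMENT), so the DDL above should not be copied to them verbatim.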

GUIDs or UUIDs are sometimes used when uniqueness must be guaranteed without a central "generator" for surrogate key values, such as in certain "disconnected" or replication scenarios.
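
A minimal sketch of the disconnected scenario, using Python's standard `uuid` module: two "nodes" (plain lists here, purely as an assumption for illustration) generate keys independently, and the rows are later merged into one hypothetical `orders` table. Random UUIDs make collisions extremely unlikely rather than strictly impossible.

```python
import sqlite3
import uuid

def new_row(note):
    """Create a row whose key is generated locally, with no central generator."""
    return (str(uuid.uuid4()), note)

# Two disconnected nodes create rows without talking to each other.
node_a = [new_row("order from laptop")]
node_b = [new_row("order from phone")]

# Later, both sets are merged into a single table keyed by the UUID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, note TEXT NOT NULL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", node_a + node_b)
merged_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Storing the UUID as text keeps the sketch simple; a native UUID or 16-byte binary type, where the DBMS offers one, would be slimmer.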


¹ That is, a superkey that would stop being unique (and therefore stop being a superkey) if any of its attributes were removed.

² Surrogates tend to use "slimmer" data types such as integers, versus "fatter" types such as strings that are often used in natural keys. Also, while it is not unusual for a natural key to be composite, there is almost never a reason to make a composite surrogate key. As a consequence, a FOREIGN KEY referencing a surrogate key tends to be slimmer than one referencing a natural key.

³ There is no need for the additional index "underneath" the surrogate key. Each new index incurs a maintenance cost for INSERT/UPDATE/DELETE operations and may be especially costly in clustered tables, where secondary indexes must typically contain a copy of the clustering key (which is often the same as the primary key) and may incur a double lookup during querying.

⁴ It is often necessary to query the data on a range that is "aligned" with the natural key. Using the natural key as the clustering key physically orders the data in the table, which can dramatically reduce I/O under some circumstances. On the other hand, surrogate keys typically do not possess an ordering that would be useful for clustering.

⁵ We can fetch the migrated natural key directly from the child table, without JOINing with the parent.
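
The last footnote can be illustrated with a small sketch, using hypothetical `country` and `city` tables and SQLite as a stand-in. Because the child carries the parent's natural key (`iso_code`), filtering on it needs no JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (iso_code TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE city (
        city_name TEXT NOT NULL,
        iso_code  TEXT NOT NULL REFERENCES country (iso_code),  -- migrated natural key
        PRIMARY KEY (iso_code, city_name)
    );
    INSERT INTO country VALUES ('DE', 'Germany'), ('FR', 'France');
    INSERT INTO city VALUES ('Berlin', 'DE'), ('Lyon', 'FR'), ('Munich', 'DE');
""")

# The ISO code is already in the child table, so the parent is never read.
german_cities = [
    row[0]
    for row in conn.execute(
        "SELECT city_name FROM city WHERE iso_code = 'DE' ORDER BY city_name"
    )
]
```

With a surrogate `country_id` in `city` instead, the same filter would have to join `country` to translate 'DE' into the id.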

110 questions
15 votes, 10 answers

Relational database design question - Surrogate-key or Natural-key?

Which one is the best practice and Why? a) Type Table, Surrogate/Artificial Key Foreign key is from user.type to type.id: b) Type Table, Natural Key Foreign key is from user.type to type.typeName:
aryaxt
10 votes, 7 answers

ID Best Practices for Databases

I was wondering what the best practices were for building and storing IDs. A few years ago, a professor told me about the dangers of a poorly constructed ID system, using the Social Security Number as an example. In particular, because SSNs do not…
8 votes, 1 answer

Preferred way to add surrogate key to existing Oracle DB table

I have to modify an existing table in a Oracle 10g DB with a few thousand records to add a surrogate autonumber key. One way that comes to my mind is to Create a new sequence Create the id column, allowing null values Updating the id column with…
simon
7 votes, 3 answers

web application user table primary key: surrogate key vs username vs email vs customer Id

I am trying to design an ecommerce web application in MySQL and I am having problems choosing the correct primary keys for the user table. the example given is just a sample example for illustration. user table have following definition CREATE …
infinityLoop
6 votes, 1 answer

nature key vs auto_increment key as the primary key

My problem is about nature key and auto_increment integer as primary key. For example, I have tables A and B and A_B_relation. A and B may be some object, and A_B_realtion record the many to many relation of A and B. Both A and B have their own…
NingLee
5 votes, 1 answer

Nginx caching: tag-based cache-busting like Varnish Hashtwo

We're about to set up a cache and reverse proxy for our site, and we're deciding whether to use Varnish or Nginx. We have complex cache-busting requirements, and we effectively require surrogate key (or tag-based) cache invalidation. Varnish offers…
jdotjdot
5 votes, 4 answers

Should primary key be auto_increment?

It is better to use primary key when design tables. But when designing a primary key, is it needed to set auto_increment? What's the benefit? I heard that can keep b-trees stable, but why? If table has a unique column, which is better: set the…
5 votes, 6 answers

Should I create a surrogate key instead of a composite key?

Structure: Actor <=== ActorMovie ===> Movie ActorMovie: ActorID (fk), MovieId (fk)... ===> pk: (ActorID, MovieID) Should do I create a surrogate key for ActorMovie table like this? ActorMovie: ActorMovieID (pk), ActorID (fk), MovieId (fk)...
Johan Gosh
4 votes, 4 answers

Should I use "id" or "unique username"?

I am using PHP, AS3 and mysql. I have a website. A flash(as3) website. The flash website store the members' information in mysql database through php. In "members" table, i have "id" as the primary key and "username" as a unique field. Now my…
roa3
4 votes, 4 answers

Surrogate Keys in Datawarehouse

I want to understand how surrogate keys are leveraged in real-time DWH environments. I get that they add the benefit of not being dependent on source-generated data to store each dimension key and also avoid having composite key built out of natural…
Abhi
4 votes, 3 answers

creating dimension surrogate keys

I understand there are good reasons for using surrogate keys in data warehouse dimensions. Still, I do not understand how I can link them to my fact table's foreign keys. In the fact table I have only natural keys, extracted during ETL. Surrogate…
Davide
4 votes, 2 answers

Performance comparison of surrogate & composite key

If a database has attributes A1, A2, A3...An and A1, A2 & A3 can form composite key together, is it better to use a surrogate key instead of a composite key? Using a surrogate key will improve the Insertion execution speed of records (this supports…
Adithya Upadhya
3 votes, 1 answer

Avoiding Polymorphic Associations in Rails

(sorry for any bad English) Let's suppose I have models A, B, C. Each model have one address. In the book "SQL Antipatterns: Avoiding the Pitfalls of Database Programming" (chapter 7 - Polymorphic Associations) there is a recipe to avoid such…
3 votes, 1 answer

How to populate fact table with Surrogate keys from dimensions

Could you please help understand how to populate fact table with Surrogate keys from dimensions. I have the following fact table and dimensions: ClaimFacts ContractDim_SK ClaimDim_SK AccountingDim_SK ClaimNbr ClaimAmount ContractDim ContractDim_SK…
Rachel
3 votes, 2 answers

Surrogate key in 'User' / 'Role' tables for desktop app? Whats the purpose?

I have to add some security for a C#/.NET WinForms/Desktop application. I am using Oracle DB back-end. The tables are simple: User (ID,Name), Role(ID,Role), UserRole(UserID,RoleID). I am using the windows account name to populate User table. Role…
dferraro