What is the impact of creating a table with a unique index but no primary key?

Question

What is the best way to make a simple many-to-many cross reference table which contains nothing but two columns which are themselves primary keys in other tables?

Does anyone have concrete evidence for or against creating a table with a single unique index, but no primary key? (Alternatives are detailed below).

Put another way: How does SQL Server internally identify rows uniquely a) in tables that have a primary key and b) in tables that do not?

In detail:

Given the input tables:

CREATE TABLE Foo ( FooID bigint identity(1,1) not null primary key, other stuff... )
CREATE TABLE Bar ( BarID bigint identity(1,1) not null primary key, other stuff... )

The three basic options are (in all cases assume a foreign key is created on the FooID and BarID columns):

-- Option 1: Compound primary key
CREATE TABLE FooBarXRef ( 
    FooID bigint not null
  , BarID bigint not null
  , PRIMARY KEY ( FooID, BarID )
  , CONSTRAINT FK... etc
)

-- Option 2: Independent primary key + unique index
CREATE TABLE FooBarXRef ( 
    FooBarXRefID bigint identity(1,1) not null primary key
  , FooID bigint not null
  , BarID bigint not null
  , CONSTRAINT FK... etc
);
CREATE UNIQUE INDEX I_FooBarXRef_FooBar ON FooBarXRef ( FooID, BarID );

-- Option 3: Unique index, no explicit primary key:
CREATE TABLE FooBarXRef ( 
    FooID bigint not null
  , BarID bigint not null
  , CONSTRAINT FK... etc
);
CREATE UNIQUE INDEX I_FooBarXRef_FooBar ON FooBarXRef ( FooID, BarID );

Is a separate identity PK on the xref table redundant, needlessly introducing another layer of constraint checking for the database engine?

On the other hand, are multi-column primary keys problematic? One proposed solution is to have the xref table contain only the two foreign keys and define a unique index on those columns, without defining a primary key at all.

I suspect that doing so will cause SQL Server to create an internal primary key for the purposes of uniquely identifying each row, thus yielding the same redundant constraints as if a primary key were defined explicitly--but I have no proof or documentation to support this. Other questions and answers suggest that there is no internal primary key by default (i.e. no equivalent to the Oracle ROWID), since %%physloc%% only indicates where a row is currently stored and is thus subject to change. My intuition is that the engine must create something to uniquely identify a row in order to implement cursors, transactions, and concurrency.

I would certainly go for option 1. It's a relationship table, so the composite PK makes perfect sense in this case. The generic assertions like "composite keys are evil" are... evil ;) — Raphaël Althaus, Feb 13 '14 at 18:07
@RaphaëlAlthaus I agree on both counts...but I'm looking for *evidence* that alternatives 2 and 3 are demonstrably less optimal :) — Joshua Honig, Feb 13 '14 at 18:09

deeg · Accepted Answer · 2014-02-13T19:24:36.383

The concept of a primary key is really about relational theory; maintaining referential integrity by building relationships across multiple tables. The SQL Server engine, by default, creates a unique clustered index when a primary key is built (assuming a clustered index doesn't exist at the moment).

It's this clustered index that defines a unique row at the leaf level. For tables that have a non-unique clustered index, SQL Server adds a 4-byte "uniquifier" to the end of your key for rows whose key values are duplicated.

TestTable1 Primary Key
TestTable2 Primary Key & Unique Non-Clustered
TestTable3 Unique Clustered
TestTable4 Primary Clustered (same as Table1 & Table3; since a primary key CAN be defined on a non-clustered index, I prefer this form because it always states explicitly which structure I want).

TestTable2 is redundant: it creates a unique clustered index to store all the records at its leaf level, and then creates a unique non-clustered index to enforce uniqueness once again. Any change to the table will hit the clustered index and then the non-clustered index.

TestTable1, TestTable3 and TestTable4 are a tie in my book: a unique clustered index structure is created on all of them. There is no physical difference in the way records are stored on a page.
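
Because the original verification scripts are no longer reachable, the four layouts listed above can be sketched roughly as follows. This is a hypothetical reconstruction, not the author's script: the column names, constraint names and index names are all assumptions, and the foreign keys are left out for brevity. The final query reads `sys.indexes` to show which index structures SQL Server actually built.

```sql
-- Hypothetical reconstruction; every object name here is an assumption.

-- TestTable1: primary key with default settings
-- (clustered, because no clustered index exists yet)
CREATE TABLE TestTable1 (
    FooID bigint NOT NULL,
    BarID bigint NOT NULL,
    CONSTRAINT PK_TestTable1 PRIMARY KEY (FooID, BarID)
);

-- TestTable2: surrogate primary key plus a unique non-clustered index
CREATE TABLE TestTable2 (
    XRefID bigint IDENTITY(1,1) NOT NULL,
    FooID  bigint NOT NULL,
    BarID  bigint NOT NULL,
    CONSTRAINT PK_TestTable2 PRIMARY KEY (XRefID)
);
CREATE UNIQUE NONCLUSTERED INDEX UX_TestTable2 ON TestTable2 (FooID, BarID);

-- TestTable3: unique clustered index, no primary key
CREATE TABLE TestTable3 (
    FooID bigint NOT NULL,
    BarID bigint NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX UX_TestTable3 ON TestTable3 (FooID, BarID);

-- TestTable4: primary key explicitly declared as clustered
CREATE TABLE TestTable4 (
    FooID bigint NOT NULL,
    BarID bigint NOT NULL,
    CONSTRAINT PK_TestTable4 PRIMARY KEY CLUSTERED (FooID, BarID)
);

-- Compare the index structures that were created for each table
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       i.type_desc,
       i.is_unique,
       i.is_primary_key
FROM sys.indexes AS i
WHERE OBJECT_NAME(i.object_id) LIKE 'TestTable[1-4]'
ORDER BY table_name, i.index_id;
```

Note that a plain `CREATE UNIQUE INDEX` (as written in Option 3 of the question) is non-clustered by default, so that table would remain a heap and show up in `sys.indexes` with a `HEAP` row as well; the `CLUSTERED` keyword is what makes TestTable3 equivalent to the primary key variants.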

However, SQL Server replication (transactional replication in particular) requires a primary key on every replicated table. If you'll be using replication in the future, you may want to make sure all your unique clustered indexes are primary keys as well.
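
As a rough way to check a database ahead of time, the catalog views can list the tables that have no primary key. This is a minimal sketch that assumes you only care about user tables in the current database; the column aliases are arbitrary.

```sql
-- Lists user tables in the current database that have no primary key constraint.
SELECT SCHEMA_NAME(t.schema_id) AS schema_name,
       t.name                   AS table_name
FROM sys.tables AS t
WHERE OBJECTPROPERTY(t.object_id, 'TableHasPrimaryKey') = 0
ORDER BY schema_name, table_name;
```

Any xref table built like TestTable3 (unique clustered index only) would appear in this list.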

I seem to be unable to paste in my verifying scripts, so here they are on hastebin.

http://hastebin.com/qucajimixi.vbs

Link is broken, how about a [SqlFiddle](http://sqlfiddle.com/) session example? — ΩmegaMan, Nov 22 '21 at 17:36

score 2 · Answer 2 · answered Feb 13 '14 at 18:39

Well, it all depends on the requirement. As far as I know:

PRIMARY KEY = UNIQUE KEY + NOT NULL

What this tells you is that you can have multiple NOT NULL unique indexes (non-clustered) on a table, but you cannot have multiple primary keys (clustered) on a table.
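
To illustrate the point, here is a small sketch of a table with one primary key and two additional unique constraints. The table and all of its names are hypothetical and exist only for this example.

```sql
-- Hypothetical table; all names are assumptions for illustration.
CREATE TABLE Account (
    AccountID bigint IDENTITY(1,1) NOT NULL,
    Email     nvarchar(320) NOT NULL,
    Username  nvarchar(100) NOT NULL,
    CONSTRAINT PK_Account PRIMARY KEY CLUSTERED (AccountID),
    CONSTRAINT UQ_Account_Email UNIQUE NONCLUSTERED (Email),
    CONSTRAINT UQ_Account_Username UNIQUE NONCLUSTERED (Username)
);

-- A second primary key on the same table is rejected by SQL Server:
-- ALTER TABLE Account ADD CONSTRAINT PK_Account_2 PRIMARY KEY NONCLUSTERED (Email);
```

The practical difference from a primary key is nullability: a unique constraint or index can be placed on a nullable column, whereas primary key columns must be NOT NULL.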

I am a huge believer in the relational database model and in working with primary key/foreign key relationships. DB replication requires a primary key on a table; therefore, it is always good practice to create a primary key instead of unique keys for your table.
