Does composite key comprised of many keys impact the performance?

Question

I have a situation where a composite key is comprised of 5 or more keys as below.

Option 1:

Test table:
schoolid
testid  --composite Primary key on both the columns

TestStudent table:
schoolId
testId
studentId  -- composite primary key of all columns

TestStudentMCQ table:
schoolid
testid
studentid
choiceid  --composite Primary key of all columns

I can make the Test table's PK as TestId so that the MCQAnswer table will have TestId, StudentId, QuestionId and ChoiceId

If the performance impact is not significant I prefer to go with the first option.

Further I use the RLS along with the SchoolId.

I also wanted to mention here:

I also have another table TestStudent, which has the composite primary key as shown above. The same thing can be designed as follows.

Option 2:

Test table:
schoolid
testid  --composite Primary key on both the columns

TestStudent table:
id
schoolId
testId
studentId  -- id is the identity primary key here

TestStudentMCQ table:
testStudentId
choiceid  --composite Primary key of both columns

Which approach is better?

Pieter Geerkens · Answer 1 · 2016-07-15T03:14:01.007

The designation primary key originates in the pre-relational days of ISAM and other database types, and was unfortunately co-opted by Codd, Date, et al into relational terminology as part of the logical design. I say unfortunately because the large relational vendors then co-opted it back into the physical design as the specification of the performantly optimal foreign key lookup mechanism - a physical design attribute.

In consequence of this confusion, I avoid the term primary key whenever possible. There are Natural (or Business) Keys in the logical design, and Surrogate Keys in the physical design. The decision of whether or not to use a Surrogate Key on a particular table is a thorny one, but one can say one thing: if a table is known to never have more than about 50 to 200 rows the engine will probably never use an indexed lookup, and so a Surrogate Key is unnecessary. Needless to say, all Keys whether Natural or Surrogate should be enforced by Unique Indices, for both logical and physical design reasons.

However, it is a tenet of being a relational table, which is to say be in First Normal Form, that every table have a Natural (or Business) Key. As Surrogate Keys should never be exposed to users, such a Natural Business Key becomes the only means for the user to specify individual rows.

When one has chosen to implement a Surrogate Key then the index used to enforce that Key is nearly always best made the Clustered Index.

To address your specific concern: Yes, a clustered index on a large composite key will definitely impact performance of a table containing more than 100 or 200 rows. Enforce your (composite) Natural Key with a non-clustered index, and possibly add a few well chosen additional coverage fields to that index. Make the index enforcing your Surrogate Key the clustered index unless you have very good and well documented reason to do otherwise.

Reason is that:

Index height varies inversely with key width. A wide composite index can have one or two (or more!) levels less than a narrower index, requiring both more soft and hard page faults and more cache usage, which impact performance not only of lookups for the table itself but everything serviced by that engine.
Every lookup through a non-clustered index, unless covered by that index, then requires a further lookup into the clustered index via the clustered index key. It is really important to make the clustered index lookups efficient because they control the efficiency of all lookups.

Finally, remember to document all clustering and index decisions with their impact and rationale in both the logical and the physical design. Physical design decisions will regularly be under scrutiny by the DBA's for ways and means of tuning performance - but decisions made strictly in the logical design must never be changed for performance reasons. Help your DBA's out and clearly mark which are which.

score 1 · Answer 2 · answered Jul 15 '16 at 03:15

Does composite key comprised of many keys impact the performance?

well yes it do impact the performance if you are referring on how the clustered index will be created on those primary keys as the keys you use in the clustered index will be the sorting pattern it will follow, also if you create Non-Clustered index, it will use the Keys you used in the Clustered index as a bookmark to reference the data rows that is why it should unique,short and static. to prevent performance issue,

but if you are talking about Primary keys only, then no, it will only dictate the uniqueness of your rows based on composite primary key you have used.

score -1 · Answer 3 · answered Jul 14 '16 at 09:48

Clustered index from Primary key is copied to all non-clustered indexes, increasing theirs' size. For small and medium sized tables this could be not noticeably at all, but on large tables you want to fight for a single byte to reduce table size as much as possible (especially for Data Warehouses' facts table).

So, for small and medium sized tables you may want to make business key as primary key - in this case, composite for all columns - with this approach, connection tables and relation columns becomes much more readable. For large tables, you may want to make surrogate auto increment integer as primary key and create UNIQUE constraint on business key (composite in your case).

score -1 · Answer 4 · answered Jul 14 '16 at 10:32

-1

We should make a distinction between a primary key constraint and a clustered index. The SQL Server by default creates a CI to support a PK constraint, but it could be overriden.

Primary key is unique and idenifies a row (in your case, if testid is unique by itself, it would be a mistake to add any other column to it). Clustered index doesn't need to be unique and affects the physical layout of data rows in a table.

Primary key is also used in FKs if a table is referenced by other tables. If this a case, it's more convenient if it's single column.

For performance reasons, clustered index should be narrow, ever-increasing and immutable. Narrow, because every NCI will contain the whole CI as a pointer to the row data. Increasing to prevent page splits and fragmentation. Immutable, because any change to it will cause all NCIs to be updated.

Tl;dr, for primary key I'd use a business key if there's no reason against. For a CI, I'd consider using a sequence first.

answered Jul 14 '16 at 10:32

dean

9,960
2
25
26

FKs can also target other keys in the table (either unique constraints or just unique indexes). – Damien_The_Unbeliever Jul 14 '16 at 13:52
@Damien_The_Unbeliever Of course, but I don't see relevance here, given the OPs question – dean Jul 14 '16 at 13:57
It was the implication of this sentence - "Primary key is also used in FKs if a table is referenced by other tables". It implies that every FK will have a copy of the PK, but that is not necessarily so. – Damien_The_Unbeliever Jul 14 '16 at 13:58
@Damien_The_Unbeliever OK, fair point, but I don't think I'm going to edit, though; there's always one more thing. Eg, not every unique index can be used as a FK target. And so on. – dean Jul 14 '16 at 14:06

Does composite key comprised of many keys impact the performance?

4 Answers4

Linked