3

I've seen various articles which point to using a pattern of:

  • ID (uniqueidentifier) : PK, non-clustered
  • ClusteredID (int) : INDEX, clustered (IDENTITY)
  • My other columns (various)

My interpretation of this pattern is that it allows a record to be identified using a uniqueidentifier (which is useful in many situations), but means that the clustered index does not get so fragmented because of the incremental ordering of generated int identity column values.

This all makes sense to me so far.

What I'm not fully understanding is that:

If the unique identifier is the uniqueidentifier, then it makes sense for me to use this as the FK in related tables. Now, when executing a select statement joining the two tables SQL Server will perform the join on the uniqueidentifiers. However, seeing as it is actually the int index which is clustered, then I'd have thought that clustering on the int would not help performance with joins.

If my assumptions above are correct, then I'm not appreciating the benefit of adding the clustered int index.

I guess, therefore, that my assumptions are not correct. Could anybody tell me where I'm going wrong in my thinking?

Giorgi Nakeuri
  • 35,155
  • 8
  • 47
  • 75
Neil W
  • 7,670
  • 3
  • 28
  • 41
  • 1
    A good description of the topic can be found [here](http://stackoverflow.com/questions/11938044/what-are-the-best-practices-for-using-a-guid-as-a-primary-key-specifically-rega) - What are best practices for using a GUID as a primary key ... – Bogdan Bogdanov Nov 10 '13 at 12:54
  • 1
    A good explanation on your question is [GUID as Primary / Foreign key dilemma](http://stackoverflow.com/questions/2996031/guid-primary-foreign-key-dilemma-sql-server) – Bogdan Bogdanov Nov 10 '13 at 13:02
  • Thanks for both of those links, Bogdan. Very helpful. – Neil W Nov 10 '13 at 13:09
  • 1
    As usual: **it depends**. First of all: why not reference the `ClusteredD` from your child tables? You can totally do this - since it's an `INT IDENTITY` column, you can easily put a `UNIQUE CLUSTERED INDEX` on it and thus use it for FK constraints. Secondly: since there **IS** an index on the `ID` column, the joins will be as efficient as they can be. I don't see any downside here... Yes, the `ID` isn't determining the physical table structure - but the **index** on `ID` still provides an ordering (of just the index columns, of course), which is what helps with JOINs – marc_s Nov 10 '13 at 13:10
  • Thanks @marc_s ... I did some more digging on the relationships between Primary Key, Clustered Index, and other Indexes. I hadn't realised that for each index there is a separate 'hidden' table created linking the index and clustered index to enable efficient retrieval. Thanks for your comments here and your detailed explanation in the links that Bogdan provided. – Neil W Nov 10 '13 at 14:48
  • 2
    It's not a "hidden" table - every nonclustered index just also contains the **clustering key** at the leaf level. So when you've found an entry in your non-clustered index, you also always have its clustering key available to get to the actual data page(s). – marc_s Nov 10 '13 at 14:53
  • 1
    @marc_s - that last small comment of yours is worth is weight in gold! Thanks. – Dale K Apr 14 '15 at 22:43

0 Answers0