What's the effect of including an "include" column in a non-clustered index that's already part of the clustering key?

Question

Suppose I cluster a table on (RetailerID, PurchaseDate, UserID). That's the "clustering key", and clustering keys are always included in all non-clustered indexes. https://stackoverflow.com/a/23057196/88409 https://stackoverflow.com/a/2747869/88409

Next, I create a non-clustered index "StorePurchasesIndex" keyed on (RetailerID, StoreID, PurchaseDate), to make lookups that include just a subset of specific stores faster.

The first question is, do I need to explicitly include UserID as an included column, or will it be there implicitly by virtue of the clustering key including it? I'm pretty sure I do NOT need to include UserID explicitly in this case, but correct me if I'm wrong.

What I'm really interested in knowing is what happens if I do explicitly include UserID as an include column. Will it be included in the index redundantly, once as part of the clustering key, and again as an included column? Or does SQL Server recognize the intent and avoid storing it twice since it's already included by virtue of the clustering key?

The second question is, if it's not included redundant, then is there a benefit to including it explicitly. For example, will it ensure UserID is included in the future, even if the clustering key changes in such a way that it excludes UserID and the index is rebuilt?

The clustering key column(s) is (are) included at the **leaf level** in every nonclustered index, which means the data is there, once the nonclustered index value has been found - but it's not in the index tree, so it cannot be used for searching/filtering. And SQL Server is clever enough to *avoid* adding a column again that's already part of the index leaf level - so in your case, adding the clustering key column is just pointless and superfluous — marc_s, Aug 04 '16 at 19:35
Yes, but defining the index with the included column ensure that it stays in the index even if it disappears from the clustering key, right? — Triynko, Aug 04 '16 at 21:56

score 1 · Answer 1 · edited Aug 04 '16 at 20:30

For a nonclustered index, index key/keys are present at the root and leaf level by default..

This default case varies by the definition of your nonclustered index..

When you create a non-unique clustered index, SQL Server will add clustered key at root to make it unique and it will be also present in leaf levels as well..

When you create a unique clustered index, SQL Server will not include clustered key at the root level, but will include at leaf level..

So coming to your questions..

The first question is, do I need to explicitly include UserID as an included column, or will it be there implicitly by virtue of the clustering key including it?

Yes you are right, you don't need to add userid in include list..

What I'm really interested in knowing is what happens if I do explicitly include UserID as an include column. Will it be included in the index redundantly, once as part of the clustering key, and again as an included column? Or does SQL Server recognize the intent and avoid storing it twice since it's already included by virtue of the clustering key?

SQL Server is smart enough to ignore this..

The second question is, if it's not included redundant, then is there a benefit to including it explicitly.

Even if you add SQL Server will ignore the column

For example, will it ensure UserID is included in the future, even if the clustering key changes in such a way that it excludes UserID and the index is rebuilt?

You can't alter clustered key definition,you have to drop and recreate it,so when clustered key changes,the non clustered index is rebuilt as per its definition, so userid will be present

What's the effect of including an "include" column in a non-clustered index that's already part of the clustering key?

1 Answers1