Are indexes basically a copy of the tables?

Question

today I went to a job interview and while I was there I heard that "Indexes are bascially a clones of the tables, on which they're made".

Could someone relate to this statement? Honestly I've never heard this kind of Index definition

There are many discussions and sharing already, https://stackoverflow.com/questions/1108/how-does-database-indexing-work — Jacob, Jul 09 '18 at 10:30
This is mostly false, even if we're talking about a clustered index -- it's more accurate to say the clustered index *is* the table, not a clone of it. And while non-clustered indexes do contain copies of data in the table (which is why covering indexes are a thing) their structure differs from the base table. "Clone" is a misleading term, but it is worth keeping in mind that indexes redundantly copy data (for better and worse). — Jeroen Mostert, Jul 09 '18 at 10:37
There are ~12 different types of indexes in SQL Server https://learn.microsoft.com/en-us/sql/relational-databases/indexes/indexes?view=sql-server-2017. If they interviewer meant a specific type of index- they should have specified. Even then, plenty of people here will disagree with that statement. — Zorkolot, Jul 09 '18 at 18:51

score 2 · Accepted Answer · answered Jul 09 '18 at 10:35

2

Not really, although they could be.

Every index (including the clustered index) will be using the index keys in all of its internal nodes. What's different is what happens when we reach the leaves of the index.

In a normal, old-school non-clustered index in SQL Server, what you'll find in the leaves are the key values for the clustered index (or some form of row ID for heap tables). Whereas in the clustered index, you'll find the values for all columns, not just those which are the clustered keys and (for that index) it's specific keys.

INCLUDE in indexes muddies the water somewhat by including extra columns at the leaf level in non-clustered indexes.

If the total set of columns in (index keys, clustered-index keys, included columns) for a non-clustered index is the same as the set of all columns in the table, then to an extent the non-clustered index does seem to be a copy of the table - at least to the extent that any query making use of this index will not have to perform any table-lookups to retrieve all data.

If the set of columns above isn't the same as the set of all columns in the table then it's not a copy of the table. It's a copy of a subset of columns of the table. Of course, if this subset of columns are all of the columns required by a particular query then a table lookup can still be avoided.

answered Jul 09 '18 at 10:35

Damien_The_Unbeliever

234,701
27
340
448

I came up with another question: Does non-clustered index contain any data in the leaves or just clustered index key values/RIDs for the heap - to the data? – Rocket128 Oct 05 '18 at 07:51
1

@Rocket128 - it will include the non-clustered index keys, the lookup values (clustered keys/RID) and any `INCLUDE` columns. – Damien_The_Unbeliever Oct 05 '18 at 07:53
Therefore the basic non-clustered index with no included columns will not contain any data. There'll be only clustered keys/RID to the actual data in the base table. Am I right? – Rocket128 Oct 05 '18 at 08:01
1

@Rocket128 - bear in mind that the leaves still (potentially) cover multiple distinct values. That's why I said the non-clustered index keys are in there too. – Damien_The_Unbeliever Oct 05 '18 at 08:03
What does non clustered index keys basically do? I mean the index keys are some kind of pointers to the rows in the table but that's what Clustered index keys are so what's about non-clustered index keys? – Rocket128 Oct 05 '18 at 08:23
1

@Rocket128 - the actual keys for this index. If this index is on `a,b` and the clustered index is on `c,d` then at the leaf, we'll have one or more entries that say "For *these* `a,b` values, lookup these rows" followed by a set of `c,d` pairs. – Damien_The_Unbeliever Oct 05 '18 at 08:25
So if i made nonclustered index on columns 'a,b' that index will not contain those columns data, but it will have non-clustered index keys for those 2 columns, which will help in searching values from 'a,b'? – Rocket128 Oct 05 '18 at 08:46

score 1 · Answer 2 · answered Jul 09 '18 at 10:34

If you spoke about a clustered index then it's true. Just check documentation:

Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be stored in only one order.

The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.

But if you spoke about non-clustered index then it's false coz table store as a heap and index separate from table. In this case index is another object which looks like a data structure.

Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and each key value entry has a pointer to the data row that contains the key value.

The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.

You can add nonkey columns to the leaf level of the nonclustered index to by-pass existing index key limits, and execute fully covered, indexed, queries. For more information, see Create Indexes with Included Columns. For details about index key limits see Maximum Capacity Specifications for SQL Server.

Are indexes basically a copy of the tables?

2 Answers2