5

I have some doubts on choosing the right index and have some questions:

Clustered index

What is the best candidate?

Usually is the primary key but if the primary key is not used in the search by eg CustomerNo is used to search on customers should the clustered index put on CustomerNo?

Views with SchemaBinding

If have a view with indexes I read that these are not used but those on tables are.

Pointless no? Or am I missing the point? Will it make a difference using "NOExpand" to force to read the index from the view rather than the table?

Nonclustered indexes

Is it good practice when adding a nonclustered index to include every possible column till you reach the limit?

Many thanks for your time. I am reading massive database and speed is a must

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
user9969
  • 15,632
  • 39
  • 107
  • 175
  • 1
    frankly nobody has answered your question .If CustomerNo can be PK and this column is most frequently use in search and this column is FK in most table then make it PK. ii) Other column who are not PK but use in SEARCH can be made nonclustered index.OTHER non pk which are not use in search condition but use in select query can be included in non clustered index . – KumarHarsh Feb 11 '14 at 09:33

4 Answers4

9

The clustered index is the index that (a) defines the storage layout of your table (the table data is physically sorted by the clustering key), and (b) is used as the "row locator" in every single nonclustered index on that table.

Therefore, the clustered index should be

  • narrow (4 byte is ideal, 8 byte OK - anything else is too much)
  • unique (if you don't use a unique clustered index, SQL Server will add a 4 byte uniqueifier to your table)
  • static (shouldn't change)
  • optimally it should be ever-increasing
  • fixed with - e.g. don't use large Varchar(x) columns in your clustered index

Out of these requirements, the INT IDENTITY seems to be the most logical, most obvious choice. Don't use variable length columns, don't use multiple columns (if ever possible), don't use GUID (that's a horribly bad choice because of it's size and randomness)

For more background info on clustering keys and clustered indexes - read everything that Kimberly Tripp ever publishes! She's the Queen of Indexing in SQL Server - she knows her stuff extremely well!

See e.g. these blog posts:

In general: don't overindex! too many indices is often worse than none!

For non-clustered indexes: I would typically index foreign key columns - those indexes help with JOINs and other operations and make things faster.

Other than that: don't put too many indexes in your database ! Every index must be maintained on every CRUD operation on your table! This is overhead - don't excessively index!

An index with all columns of a table is an especially bad idea since it really cannot be used for much - but carries a lot of administrative overhead.

Run your app, profile it - see which operations are slow, try to optimize those by adding a few selective indexes to your table.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
2

Clustered Indexes

Just to add to marc_s good answer, one exception to the standard INT IDENTITY PK approach to Clustered Indexes is when you have Parent Child tables, where all the children are frequently always retrieved at the same time as the parent. In this case, clustering by Child table by the Parent PK will reduce the number of pages read when the children are retrieved. For example:

CREATE TABLE Invoice
(
   -- Use the default MS Approach on the parent, viz Clustered by Surrogate PK 
   InvoiceID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
   -- Index Fields here
);


CREATE TABLE InvoiceLineItem
(
   -- Own Surrogate Key
   InvoiceLineItemID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
   InvoiceID INT NOT NULL FOREIGN KEY REFERENCES Invoice(InvoiceID),
   -- Line Item Fields Here
);

-- But Cluster on the Parent FK
CREATE CLUSTERED INDEX CL_InvoiceLineItem ON InvoiceLineItem(InvoiceID);

NonClustered Indexes

No, never just include columns without careful thought - the index tree needs to be as narrow as possible. The ordering of the index columns is critical, and always ensure that the index is designed with selectivity of the data in mind - you will need to have a good understanding of the distribution of your data in order to choose optimal indexes.

You can consider using covering indexes to include (at most, a few) columns which would otherwise have required a bookmark lookup from the Nonclustered index back into the table when tuning performance-critical queries.

Community
  • 1
  • 1
StuartLC
  • 104,537
  • 17
  • 209
  • 285
1

As a very basic rule of thumb I use, is to use nonclustered indexes when small amounts of data will be returned and clustered indexes when larger resultsets will be returned by your query.

I recomend you read Clustered Index Design Guidelines

As for indexing views: indexing views works the same as indexing the table. It can improve preformance but like indexing tables it can also slow things down.

I recomend you read Improving Performance with SQL Server 2008 Indexed Views


In genral when indexing i find less is better. You need to research your data not just slap indexes on everthing. Check what you are linking on, add indexes and check the Execution plan. Sometimes what you think would make a good index actualy can make thing slower.

Linda Lawton - DaImTo
  • 106,405
  • 32
  • 180
  • 449
0

Views with SchemaBinding

...

Pointless no? Or am I missing the point?

(More properly, indexed views, schemabinding is a means to an end here, and the rest of the text is more talking about indexed views)

There can be (at least) two reasons for creating an indexed view. Without seeing your database, it's impossible to tell which of those reasons apply.

The first is to compute intermediate results which are expensive to compute from the base table. In order to benefit from that computation, you need to ensure your query uses the indexes. To use the indexes you either need to be querying the view and specifying NOEXPAND, or be using Enterprise or Developer edition (On Ent/Dev editions the index might be used even if the base table is queried and the view isn't mentioned)

The second reason is to enforce a constraint that isn't enforceable in a simpler manner, by implementing e.g. a unique constraint on the view, this may be enforcing some form of conditional uniqueness on the base table.

An example of the second - say you want table T to be able to contain multiple rows with the same U value - but of those rows, only one may be marked as the Default. Before filtered indexes were available, this was commonly achieved as:

CREATE VIEW DRI_T_OneDefault
WITH SCHEMABINDING
AS
    SELECT U
    FROM S.T
    WHERE Default = 1
GO
CREATE UNIQUE CLUSTERED INDEX IX_DRI_T_OneDefault on DRI_T_OneDefault (U)

The point is that these indexes enforce a constraint. It doesn't matter (in such a case) whether any query every actually uses the index. In the same way that any unique constraint may be declared on a base table but never actually used in any queries.

Community
  • 1
  • 1
Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448