1

This question concerns to Oracle DB, so if there are general answers I would like to know. As I am discarding information from Derby/MySQL and other DBs regarding this subject.

Let's say I have several queries using the following columns on its WHERE clause:

Column | Cardinality | Selectivity
_______|__________________________
A      | low         | low
B      | high        | low
C      | low         | low
D      | high        | high
E      | low         | low
F      | low         | low
-- Queries
SELECT * FROM T WHERE A=? AND B=? 
SELECT * FROM T WHERE A=? AND B=? AND C=?
SELECT * FROM T WHERE A=? AND C=?
SELECT * FROM T WHERE A=? AND C=? AND D=?
SELECT * FROM T WHERE A=? AND E=? AND F=?
  1. Is there any benefit from pairing these columns (taking into account cardinality mixing) as composite indexes? If so, what is the logic to follow? I have understood this explanation but it is for SQL Server and it may behave differently.

  2. Is it worthwhile to do covering indexes instead of individual small composite indexes?

  3. Does it matter the column order of composite indexes? i.e:

-- Regardless the column order on the table creation.
CREATE INDEX NDX_1 ON T (A, C);
-- Versus:
CREATE INDEX NDX_1 ON T (C, A);
  1. Would this index be useful?
CREATE INDEX NDX_2 ON T(E, F); -- (low + low) Ignoring 'A' column.
Rick James
  • 135,179
  • 13
  • 127
  • 222
another
  • 3,440
  • 4
  • 27
  • 34

2 Answers2

2

A few things and bear in mind these are generalities

  • Generally you can only use the leading parts of an index. So looking at your examples

    If you have an index on ( A, B, C ) and you have a predicate on A and C, then only the index on A can be used. Now there are some cases where the non-leading part of an index can be used; you will see this in an execution plan as a SKIP-SCAN operation, but they are often sub-optimal. So you may want to have (A, C) and ( C, A )

  • A covering index can be useful, if you are not projecting columns other than those in the index.

  • Again generally, you do not usually want or need an index if the column has low selectivity. However, it's possible that you have two columns that individually have low selectivity, but have high selectivity when used in combination. (In fact, this is the premise of a bitmap index / star transformation in a dimensional model).

  • If a multi-column index is useful you may want to put the column with the lowest selectivity first and enable index compression. Index compression can save a huge amount of space in some cases and has very little CPU overhead.

Finally, a SQL Monitor report will help you optimizing a sql statement when it comes to running it.

Jon Heller
  • 34,999
  • 6
  • 74
  • 132
BobC
  • 4,208
  • 1
  • 12
  • 15
  • Is it a good practice to have 2 separate indexes (A, C) and (C, A) altogether? even if both are low on cardinality? Thank you. – another Jan 18 '17 at 22:50
  • 1
    @Roizpi. I tend not to be a big fan of "best practices"; but rather I would try and give you the information so that you can make an informed decision. The reason is that a lot of the answers are simply - "It Depends". However if both A and C are both low *selectivity*, I would probably not create both indexes. – BobC Jan 18 '17 at 23:03
1

The minimum number indexes to optimally handle all 5 cases:

(A, B, C)  -- in exactly this order
(A, C, D)  -- in exactly this order
(A, E, F)  -- in any order

If you add another SELECT, all bets are off.

When to have (A, C) and (C, A)?...

  • Each handles the case where only the first column is being used.
  • The former is optimal for WHERE A=1 AND C>5; the latter is not. (Etc) Note: = versus some kind of "range" test matters.

When designing indexes for a table, first write out all the queries.

More discussion: Higher cardinality column first in an index when involving a range?

Rick James
  • 135,179
  • 13
  • 127
  • 222