0

I have a table MyTable with several columns with the following data types:

Date datetimeoffset
Column0 varchar(20)
Column1 float
Column2 float
Column3 int

I run the following query:

SELECT Date, Column1
FROM
dbo.MyTable
WHERE Date BETWEEN '2023-06-01 00:00:00' AND '2023-06-10 00:00:00'
AND Column0 = 'value0'

I would like to speed up the query. I heard about some indices. Should I change the type of Column0 to uniqueidentifier to speed up the query instead varchar? Initially, I thought varchar is smaller than uniqueidentifier. In my table Column0 has 6 different values.

I assume that if instead of varchar in the table an uniqueidentifier is introduced and when executing the mentioned query an ancillary table with my six values used via JOIN, will it be faster?

What if there are several similar tables with common columns, e.g. Date and Column0?

  • 1
    No. Only index on date column will speed up – SQLpro Jun 13 '23 at 13:48
  • Not only is it unlikely that would make the query go faster, it will probably slow it down. – Zohar Peled Jun 13 '23 at 13:48
  • what is in the varchar(20) column? uniqueidentifier is GUID and the string representation of those is >20 chars. So not obvious that this conversion is even possible – Martin Smith Jun 13 '23 at 13:50
  • 1
    But anyway you need a composite index on `(Column0 ,Date) INCLUDE (Column1)` for this specific query. And if you only have 6 values just use a tinyint and store the actual strings in a separate lookup table – Martin Smith Jun 13 '23 at 13:52
  • @MartinSmith Why don't you put that as an answer – Charlieface Jun 13 '23 at 14:00
  • 1
    Also, [What do BETWEEN and the devil have in common?](https://sqlblog.org/2011/10/19/what-do-between-and-the-devil-have-in-common) – Zohar Peled Jun 13 '23 at 14:01
  • 1
    if you have very little variance (few different unique values) on Column0, it might be better to have an index on Date, Column0 – siggemannen Jun 13 '23 at 14:04
  • Yes a GUID can be represented as varchar(20) if you use Base85. Uniqueidentifier only uses 16 bytes, so it should be a little faster. – Stig Jun 13 '23 at 14:15
  • btw I have several tables with Date and Column0. How to create and index on these two columns for all tables within my db? `CREATE INDEX MyIndex ON schema (Date, Column0);` – Arseniy Sleptsov Jun 13 '23 at 15:48
  • @siggemannen should it be non-clustered? do I need to create for all my tables? these two columns are in each table – Arseniy Sleptsov Jun 13 '23 at 15:52
  • 1
    Usually you already have a clustered index. I'm not sure it's possible to answer exactly if ALL of your tables needs indexes :) You asked about one query and people answered that index on Column0, Date include (column1) OR Date, Column0 Include(column1) might help, i would say both suggestions were thinking non-clustered. But you're asking questions which suggest you didn't read the details of the replies, so please do that first – siggemannen Jun 13 '23 at 15:58

2 Answers2

1

You have given us exactly one query to look at.

For that single query the unambiguously correct index to add is

(Column0 ,Date) INCLUDE (Column1) 

(or just (Column0 ,Date) if it is clustered)

Selectivity does not enter into it.

Equality columns need to go before range columns and the index "covers" the query by including Column1 so does not need to do any key lookups.

With the above index it is able to seek into the composite key ('value0', '2023-06-01 00:00:00') and read the rows in order until it gets to a key > ('value0', '2023-06-10 00:00:00') so the index seek will read exactly the minimal rows required for the query's WHERE clause.

If you have an index with leading column Date the best it can do is a range seek on that and a residual predicate to ignore rows read where Column0 is not 'value0'.

Regarding changing the column datatype to uniqueidentifier. The index seek is already pretty efficient. It may have a minor beneficial effect though due to a combination of smaller size and simpler comparison semantics than strings. But instead of storing the same varchar(20) strings repeatedly this seems an obvious candidate for storing them in a lookup table anyway though and then you could replace with a tinyint for much smaller size.

However typically a database workload does not just consist of one SELECT query. So you need to pick an indexing strategy that works well for your overall workload (including effect on writes).

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
0

The query speed is not expected to be improved by changing the data type from varchar to uniqueidentifier. Instead, think about constructing indexes on the columns that are used for filtering and joining, for as a non-clustered index on the Date column for date range filters. If Column0's selectivity is low, an index might not be helpful, but if it is very selective and regularly filtered, you might think about making one. It's doubtful that a JOIN with an auxiliary table will result in considerable benefits. For better query performance, put your attention on index optimization.

Venkat
  • 549
  • 6
  • Do I need to create a non-clustered index on Date and Column0? – Arseniy Sleptsov Jun 13 '23 at 15:44
  • For improved query performance, construct a non-clustered index on the Date field. Only if it is often filtered and very selective should you think about putting an index on Column 0. – Venkat Jun 13 '23 at 16:22