1

I try to optimize index usage in some of the queries I use often. Recently I have learnt that using function and implicit conversions in where clause is not optimal. To my surprise implicit conversion can be faster due to proper index usage.

I have a table called Records. Clustered index & primary key is on Id column (int) and non-clustered index on column Created (datetime). To avoid implicit conversion I created @dd variable.

declare @dd Datetime

set @dd = '2019-08-25'

--1st option
select  * from Records where Created > @dd
--2nd option
select  * from Records where Created > '2019-08-25'    
--3rd option
select  * from Records where Year(Created) = 2019 and MONTH(Created) = 8 and DAY(Created) = 25

Unfortunately 1st option uses Index scan (column Id). 2nd option uses index seek (column Created) and key lookup which is nice but I wonder why 1st option doesn't do the same. I've added 3rd option just to see the difference and it behaves as 1st option.

I've found query execution plan : missing index which leads to blog post about this behavior but it doesn't explain why it happens. I can see there is a difference in estimated number of rows. When I set date to '2019-06-25', all three option are using similar plan with index scan.

So is it a rule of thumb to use implicit conversion when I can expect number of rows will be low? I am quite confused here.

Bendom
  • 175
  • 1
  • 14
  • Parameter vs specific value. The first one is more general. You should provide table structure and sample data. Are the statistics up to date? Do you really need all columns `SELECT *` is antipattern.Depending on data distribution query optimizer could skip using index(here created) and start using PK(it has to get all columns so there is no point of using index and then do additional key lookup). The 3rd option is not [SARGable](https://stackoverflow.com/questions/799584/what-makes-a-sql-statement-sargable#799616) – Lukasz Szozda Aug 25 '19 at 10:52
  • 1
    @Bendon . . . You do not need to worry about implicit conversion of a string to a date/time. The compiler does the conversion, so the optimizer sees a date/time constant. – Gordon Linoff Aug 25 '19 at 11:34
  • Adding to @GordonLinoff's comment, implicit conversion of datetime literals is required because T-SQL doesn't have syntax for a datetime literal. This is not a performance concern, Implicit conversion is a concern in cases where it results in a non-sargable expression due to data type precedence (e.g. comparing an int literal with a varchar column) and when applying functions like your last query. – Dan Guzman Aug 25 '19 at 11:38
  • The issue in your queries is that you are using a variable instead of parameter. You should get the same performance as the literal with `select * from Records where Created > @dd OPTION(RECOMPILE)` or with parameterized query `EXEC sp_executesql N'select * from Records where Created > @dd', N'@dd datetime', @dd = '2019-08-25';`. – Dan Guzman Aug 25 '19 at 11:42
  • 1
    Just to add on to what @Dan-Guzman said, you get a different plan for query 1 because the optimizer does not know the value of '@dd' before it compiles a plan. Here is a quick page to see an example: http://www.connectsql.com/2012/07/sql-server-how-local-variables-can.html – JMabee Aug 25 '19 at 11:46

1 Answers1

0

Implicit conversion of datetime literals is required because T-SQL doesn't have syntax for a datetime literal. This is not a performance concern.

Implicit conversion is a concern in cases where it results in a non-sargable expression due to data type precedence (e.g. comparing an int literal with a varchar column). Applying functions like YEAR to columns also results in a non-sargable expression because the function result must be evaluated before the comparison.

The issue in your queries is that you are using a variable instead of parameter. You should get the same performance as the literal with:

select * from Records where Created > @dd OPTION(RECOMPILE);

or a parameterized query (assuming the cached plan does not already exist):

EXEC sp_executesql N'select * from Records where Created > @dd'
    , N'@dd datetime'
    , @dd = '2019-08-25';

With the literal, parameter, and OPTION(RECOMPLE) with a variable, SQL Server uses the statistics histogram (if one exists) during the initial compilation to better estimate the number of rows that may be returned. In the case of a variable, SQL Server uses average density statistics to estimate the number of rows. Different plans may be used as a result, especially with skewed data distribution.

Dan Guzman
  • 43,250
  • 3
  • 46
  • 71
  • OPTION(Recompile) is new to me. Thank you very much. – Bendom Aug 26 '19 at 11:28
  • @Bendom, note that `OPTION(RECOMPILE)` should be used judiciously. If the query is executed often (e.g. many times per second), the compilation costs can outweigh the gains of a better plans, which is why SQL Server caches execution plans in the first place. – Dan Guzman Aug 26 '19 at 11:33