4

Do T-SQL queries in SQL Server support short-circuiting?

For instance, I have a situation where I have two database and I'm comparing data between the two tables to match and copy some info across. In one table, the "ID" field will always have leading zeros (such as "000000001234"), and in the other table, the ID field may or may not have leading zeros (might be "000000001234" or "1234").

So my query to match the two is something like: select * from table1 where table1.ID LIKE '%1234'

To speed things up, I'm thinking of adding an OR before the like that just says: table1.ID = table2.ID to handle the case where both ID's have the padded zeros and are equal.

Will doing so speed up the query by matching items on the "=" and not evaluating the LIKE for every single row (will it short circuit and skip the LIKE)?

Sam Schutte
  • 6,666
  • 6
  • 44
  • 54

7 Answers7

8

SQL Server does NOT short circuit where conditions. it can't since it's a cost based system: How SQL Server short-circuits WHERE condition evaluation .

George Stocker
  • 57,289
  • 29
  • 176
  • 237
Mladen Prajdic
  • 15,457
  • 2
  • 43
  • 51
7

You could add a computed column to the table. Then, index the computed column and use that column in the join.

Ex:

Alter Table Table1 Add PaddedId As Right('000000000000' + Id, 12)
Create Index idx_WhateverIndexNameYouWant On Table1(PaddedId)

Then your query would be...

select * from table1 where table1.PaddedID ='000000001234'

This will use the index you just created to quickly return the row.

George Mastros
  • 24,112
  • 4
  • 51
  • 59
3

Just in case it's useful, as the linked page in Mladen Prajdic's anwer explains, CASE clauses are short-circuit evaluated.

Daniel Renshaw
  • 33,729
  • 8
  • 75
  • 94
3

You want to make sure that at least one of the tables is using its actual data type for the IDs and that it can use an index seek if possible. It depends on the selectivity of your query and the rate of matches though to determine which one should be converted to the other. If you know that you have to scan through the entire first table, then you can't use a seek anyway and you should convert that ID to the data type of the other table.

To make sure that you can use indexes, also avoid LIKE. As an example, it's much better to have:

WHERE
     T1.ID = CAST(T2.ID AS VARCHAR) OR
     T1.ID = RIGHT('0000000000' + CAST(T2.ID AS VARCHAR), 10)

than:

WHERE
     T1.ID LIKE '%' + CAST(T2.ID AS VARCHAR)

As Steven A. Lowe mentioned, the second query might be inaccurate as well.

If you are going to be using all of the rows from T1 though (in other words a LEFT OUTER JOIN to T2) then you might be better off with:

WHERE
     CAST(T1.ID AS INT) = T2.ID

Do some query plans with each method if you're not sure and see what works best.

The absolute best route to go though is as others have suggested and change the data type of the tables to match if that's at all possible. Even if you can't do it before this project is due, put it on your "to do" list for the near future.

Tom H
  • 46,766
  • 14
  • 87
  • 128
3

How about,

table1WithZero.ID = REPLICATE('0', 12-len(table2.ID))+table2.ID

In this case, it should able to use the index on the table1

Dennis C
  • 24,511
  • 12
  • 71
  • 99
1

fix the database to be consistent

select * from table1 where table1.ID LIKE '%1234'

will match '1234', '01234', '00000000001234', but also '999991234'. Using LIKE pretty much guarantees an index scan (assuming table1.ID is indexed!). Cleaning up the data will improve performance significantly.

if cleaning up the data is not possible, write a user-defined function (UDF) to strip off leading zeros, e.g.

select * from table1 where dbo.udfStripLeadingZeros(table1.ID) = '1234'

this may not improve performance (since the function will have to run for each row) but it will eliminate false matches and make the intent of the query more obvious

EDIT: Tom H's suggestion to CAST to an integer would be best, if that is possible.

Steven A. Lowe
  • 60,273
  • 18
  • 132
  • 202
  • like does not guarentee table scan like '%1234' is a table scan. like '1234%' is not a table scan, assuming the column is indexed it will use the index to find 1234... – JoshBerke Dec 19 '08 at 15:56
  • @Josh: i said "index scan", not "table scan". – Steven A. Lowe Dec 19 '08 at 17:30
  • @Steven yep I see no idea why but when I read it first time I could have sworn it said table...I'm getting old I think;-) – JoshBerke Dec 21 '08 at 18:11
  • But either way I think prefixing with a wild card character would result in a table scan maybee that's why I got them confused. – JoshBerke Dec 21 '08 at 18:12
1

If the ID is purely numeric (as your example), I would reccomend (if possible) changing that field to a number type instead. If the database is allready in use it might be hard to change the type though.

Stein G. Strindhaug
  • 5,077
  • 2
  • 28
  • 41