171

Let's say I have a table called PEOPLE having three columns, ID, LastName, and FirstName. None of these columns are indexed. LastName is more unique, and FirstName is less unique.

If I do two searches:

select * from PEOPLE where FirstName="F" and LastName="L"
select * from PEOPLE where LastName="L" and FirstName="F"

My belief is the second one is faster because the more unique criterion (LastName) comes first in the where clause, and records will get eliminated more efficiently. I don't think the optimizer is smart enough to optimize the first SQL query.

Is my understanding correct?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
ZZZ
  • 3,574
  • 10
  • 34
  • 37
  • 12
    No, that order doesn't matter - any decent query optimizer will look at **all** the WHERE clauses and figure out the most efficient way to satisfy that query – marc_s Jul 11 '12 at 15:50
  • 3
    What were your observations when you ran these two statements? What did the execution plans look like? – Conrad Frix Jul 11 '12 at 15:51
  • 3
    Are you referring to a specific RDBMS? There are indeed differences. – Bjoern Jul 11 '12 at 15:52
  • 7
    Possible duplicate of [SQL - Does the order of WHERE conditions matter?](http://stackoverflow.com/questions/3152182/sql-does-the-order-of-where-conditions-matter?rq=1) – Conrad Frix Jul 11 '12 at 15:55
  • 5
    Also [Execution order of conditions in SQL 'where' clause](http://stackoverflow.com/questions/340139/execution-order-of-conditions-in-sql-where-clause?rq=1) and [Does the order of columns in a WHERE clause matter?](http://stackoverflow.com/questions/642784/does-the-order-of-columns-in-a-where-clause-matter) and [SQL question: Does the order of the WHERE clause make a difference?](http://stackoverflow.com/questions/1458060/sql-question-does-the-order-of-the-where-clause-make-a-difference?rq=1) – Conrad Frix Jul 11 '12 at 15:58
  • 1
    @Conrad Frix: Hi, my question is kind of different, if one column is indexed, I believe optimizer is smart enough to do some optimization, if none of them is indexed, I doubt if optimizer is smart enough to tell which one is more unique – ZZZ Jul 11 '12 at 16:05
  • If your query is going to be saved then it is a good idea to logically order your conditions either way. This way if you or someone else has to modify them later the uniqueness of the columns can be inferred from the code. – Wayne Jan 09 '15 at 17:29
  • 2
    Possible duplicate of [SQL - Does the order of WHERE conditions matter?](https://stackoverflow.com/questions/3152182/sql-does-the-order-of-where-conditions-matter) – Jens Schauder Feb 06 '18 at 08:43

6 Answers6

140

No, that order doesn't matter (or at least: shouldn't matter).

Any decent query optimizer will look at all the parts of the WHERE clause and figure out the most efficient way to satisfy that query.

I know the SQL Server query optimizer will pick a suitable index - no matter which order you have your two conditions in. I assume other RDBMS will have similar strategies.

What does matter is whether or not you have a suitable index for this!

In the case of SQL Server, it will likely use an index if you have:

  • an index on (LastName, FirstName)
  • an index on (FirstName, LastName)
  • an index on just (LastName), or just (FirstName) (or both)

On the other hand - again for SQL Server - if you use SELECT * to grab all columns from a table, and the table is rather small, then there's a good chance the query optimizer will just do a table (or clustered index) scan instead of using an index (because the lookup into the full data page to get all other columns just gets too expensive very quickly).

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 1
    If there's no index(es) op could be right, depending on the data. Course doing somnething like this without indexes, would be a strange decision... – Tony Hopkinson Jul 11 '12 at 15:57
  • 1
    @TonyHopkinson: I don't think so - even without indexes I doubt there's any difference at all. After all: without indexes, what else but a full table scan can the RDBMS do, really?? – marc_s Jul 11 '12 at 15:58
  • 4
    Interesting side note with SQL server,apparently the order of NOT EXISTS within predicates can actually influence plan creation: http://bradsruminations.blogspot.com/2010/04/looking-under-hood.html – Justin Swartsel Jul 11 '12 at 16:03
  • 1
    @marc_s Depends on how the compares work out if the first one is generally false the second isn't required. Saying that eve if theres some inherrent rule to the data (as in surnames are less common than forenames), it can't be used by a DBMS anyway, unless it has stats or indexes. If there was just one condition there would be no difference. Not sure how significant the saving would be either, unless teh condition was complex or badly written not very I should think. – Tony Hopkinson Jul 12 '12 at 11:45
  • 4
    A strange thing is that for the first execution of a query the order of conditions in a WHERE clause DOES MATTER! I had two conditions, something like:`WHERE T1.col_1/T2.col_2 > 10 AND T2.col_2 <> 0` and got a `DIVIDE BY 0` error. After I switched the order the conditions the query executed succesfully. Then I switched the order back so I would expect to get the error again, but this time it worked!In the end my conclusion was that for the first run the order does matter, until the execution plan is built.After that the order doesn't matter 'cause the optimizer/exec plan will take care of it – Radu Gheorghiu Jun 23 '14 at 20:13
  • 1
    Which RDMS were you using when finding that ? I am very interested in avoiding such cases ! – Chucky Apr 09 '15 at 08:38
  • 1
    It can matter. We have a query right now with a complex view with expensive calculations where `SQL Server 2008` seems not to be able to optimize it in all sorting cases. We must manually optimize by using a subselect on the indexed cheap columns and in outer select filter on the expensive calculate properties. – djmj Sep 04 '15 at 14:38
  • 2
    I like that you said, "...or at least: shouldn't matter" - I totally agree. Sometimes it does matter, unfortunately. I have seen cases where SQL was too complex for the optimizer to handle and, and things like column order and table join order did make a difference. It depends on the RDBMS, the SQL statement complexity, and even the release. Very complex SQL can result in bad optimizer decisions or use of hard-coded defaults in the optimizer code. – Victor Di Leo Aug 03 '18 at 17:20
  • 1
    @RaduGheorghiu: you probably got a different execution plan for the third execution because some statistics got created. – Razvan Socol May 17 '20 at 17:09
  • 1
    @RazvanSocol I'm not 100% sure about that, it used to be (as far as I can remember from 6 years ago) a development server with backup restores to usually be used by 1 developer (me in this case). Data was never really added to the database, the backups were there to check bugs in production, so data change was never an option. Stats creation was not done in a automatic way since this was a DEV environment so no particular effort done for it and also no data was inserted, at least not at the 20% mark that would trigger an automatic stats update, if I remember my percentages right. – Radu Gheorghiu May 17 '20 at 17:53
  • @marc_s does writing where clause after order by class effects the performance? Specifically, if I have a sqlite view which has order by clause, and during its usage if I add a where clause like `SELECT * FROM MyView WHERE id = myid`. – Sourav Kannantha B Nov 17 '21 at 14:09
  • @SouravKannanthaB the structure **must be** `SELECT ... FROM ... JOIN(s) .... WHERE ..... ORDER BY` - you **cannot** put the `WHERE` after the `ORDER BY` – marc_s Nov 17 '21 at 14:24
  • `CREATE VIEW MyView AS SELECT ... ORDER BY ...`, `SELECT * FROM MyView WHERE id = myid`. Is this equivalent to, `SELECT ... WHERE id = myid ORDER BY ...` in terms of speed. – Sourav Kannantha B Nov 17 '21 at 16:34
  • 1
    @SouravKannanthaB: run it, measure it --> you'll know :-) As a general rule: a **view** doesn't *temporarily store* the data (or keep it in memory or anything); it's just a stored query; when selecting from a view, your SQL is combined with the stored SQL query for the view - in the end, what SQL Server actually executes might end up being identical (or very close to identical). Also, in most cases, you **cannot** have an `ORDER BY` inside a view definition – marc_s Nov 17 '21 at 16:35
27

The order of WHERE clauses should not make a difference in a database that conforms to the SQL standard. The order of evaluation is not guaranteed in most databases.

Do not think that SQL cares about the order. The following generates an error in SQL Server:

select *
from INFORMATION_SCHEMA.TABLES
where ISNUMERIC(table_name) = 1 and CAST(table_name as int) <> 0

If the first part of this clause were executed first, then only numeric table names would be cast as integers. However, it fails, providing a clear example that SQL Server (as with other databases) does not care about the order of clauses in the WHERE statement.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 1
    What does that query causing an error have to do with the order of WHERE predicate evaluation? – Jim Jul 11 '12 at 19:12
  • 12
    @Jim If `ISNUMERIC(table_name) = 1` was evaluated first, then `CAST` would only ever be called for numeric table names. But since it's not evaluated first, `CAST` is evaluated for non-numeric table names, as well, causing the error message. – hibbelig Feb 13 '13 at 10:43
  • 4
    Excellent clarification – neeohw Jul 19 '17 at 09:02
  • 1
    Just to be sure I checked if swapping the conditions would cause SQL server to handle them the other way around, but it fails both ways around. I think this can mean either of two things: (1) It's not optimizing as well as it could or (2) It's a compile-time error and SQL doesn't even start to try to compare anything, bailing out preliminary. My guess is that it's nr. 2. – Louis Somers Nov 24 '17 at 08:47
  • If all table_names are numeric, does this sql work properly? – zeusalmighty Dec 10 '21 at 15:18
11

ANSI SQL Draft 2003 5WD-01-Framework-2003-09.pdf

6.3.3.3 Rule evaluation order

...

Where the precedence is not determined by the Formats or by parentheses, effective evaluation of expressions is generally performed from left to right. However, it is implementation-dependent whether expressions are actually evaluated left to right, particularly when operands or operators might cause conditions to be raised or if the results of the expressions can be determined without completely evaluating all parts of the expression.

copied from here

Community
  • 1
  • 1
03Usr
  • 3,335
  • 6
  • 37
  • 63
4

No, all the RDBMs first start by analysing the query and optimize it by reordering your where clause.

Depending on which RDBM you are you using can display what is the result of the analyse (search for explain plan in oracle for instance)

M.

poussma
  • 7,033
  • 3
  • 43
  • 68
2

It's true as far as it goes, assuming the names aren't indexed. Different data would make it wrong though. In order to find out which way to do it, which could differ every time, the DBMS would have to run a distinct count query for each column and compare the numbers, that would cost more than just shrugging and getting on with it.

Tony Hopkinson
  • 20,172
  • 3
  • 31
  • 39
2

Original OP statement

My belief is the second one is faster because the more unique criterion (LastName) comes first in >the where clause, and records will get eliminated more efficiently. I don't think the optimizer is >smart enough to optimize the first sql.

I guess you are confusing this with selecting the order of columns while creating the indexes where you have to put the more selective columns first than second most selective and so on.

BTW, for the above two query SQL server optimizer will not do any optimization but will use Trivila plan as long as the total cost of the plan is less than parallelism threshold cost.

Gulli Meel
  • 891
  • 4
  • 6