0

Going through a book, Learning SQL by Alan Beaulieu. On topic of inner joins, it tells that whatever be the order of tables in a INNER JOIN, results are same and gives reason as follows:

If you are confused about why all three versions of the account/employee/customer query yield the same results, keep in mind that SQL is a nonprocedural language, meaning that you describe what you want to retrieve and which database objects need to be involved, but it is up to the database server to determine how best to execute your query. Using statistics gathered from your database objects, the server must pick one of three tables as a starting point (the chosen table is thereafter known as the driving table), and then decide in which order to join the remaining tables. Therefore, the order in which tables appear in your from clause is not significant.

So does it imply that if statistics gathered from database objects change, then results would also change?

GMB
  • 216,147
  • 25
  • 84
  • 135
Mandroid
  • 6,200
  • 12
  • 64
  • 134
  • The result will not change. Some less advanced query optimizers might generate a plan that's slower, but still correct. –  Apr 13 '20 at 13:41
  • If by `results` you mean the execution plan, then yes. If by `results` you mean the returned result set, then no. – JNevill Apr 13 '20 at 13:43
  • Read a definition of join on. That "explanation" is irrelevant. x join y on c is y join x on c by simple consequence of the definition of join on--stick both orders of x & y into the definition. It is *because* of that that we could write either expression for the same result & an implementation could "rewrite" it although an implementation isn't limited to executing & optimizing only as a different SQL statement . – philipxy Apr 13 '20 at 20:01
  • Does this answer your question? [Does INNER JOIN performance depends on order of tables?](https://stackoverflow.com/questions/19612208/does-inner-join-performance-depends-on-order-of-tables) – philipxy Apr 13 '20 at 22:45
  • Before considering posting please read the manual & google any error message or many clear, concise & precise phrasings of your question/problem/goal, with & without your particular strings/names & site:stackoverflow.com & tags; read many answers. If you post a question, use one phrasing as title. Reflect your research. See [ask] & the voting arrow mouseover texts. – philipxy Apr 13 '20 at 22:46

1 Answers1

3

So does it imply that if statistics gathered from database objects change, then results would also change?

No. The same query will always produce the same results (provided, of course, that the underlying data is the same). What the author is explaining is that the database may choose a strategy or another to process the query (starting from one table or another, using a this or that algorithm to join the rows, and so on). That decision is made based on many factors, some of them being based on information that is available in the statistics.

The key point is that SQL is a declarative language, not a procedural language: you don't get to chose how the database handles the query, you just tell it what result you want.

However, regardless of the algorithm that the database chooses, the result is guaranteed to be consistent.

Note that there are edge case where the database does not guarantee that results are the same for consecutive executions of the same query (like a query without a row limiting clause but without an order by): it's the responsibility of the client to provide a query whose results are properly defined (the language does gives you enough rope to hang yourself, if you really want to).

GMB
  • 216,147
  • 25
  • 84
  • 135
  • 1
    . . It is not true that "The same query will always produce the same results". Consider the use of `fetch first 1 row` in a query with no `ORDER BY`. – Gordon Linoff Apr 13 '20 at 13:46
  • @GordonLinoff: agreed. I added a section to my answer to be explicit about this. – GMB Apr 13 '20 at 13:52
  • One could argue that the desired result is an arbitrarily picked row. So, even then the result is the same: an arbitrarily picked row :-) The order of the tables in the query doesn't change this. – Thorsten Kettner Apr 13 '20 at 13:57