1

I assume the DB engine will make the right decisions, but just wanted to make sure...

posts
id, title

comments
id, text, post_id

Querying for comments with posts with a title of xxx:

SELECT * FROM posts, comments 
WHERE posts.id = comments.post_id 
AND posts.title = 'xxx';

Does it: A: first join both tables and the filter by title or B: Get post ids, then filter comments by those ids, then join

I hope and expect it to be B, which would resolve above query to:

  1. SELECT id, title FROM posts WHERE title = 'xxx'; // save id and title, create ids array from all rows
  2. SELECT * FROM comments WHERE post_id IN ids
  3. join the result of 2. with the titles of 1.
Tom J Muthirenthi
  • 3,028
  • 7
  • 40
  • 60
Chris
  • 13,100
  • 23
  • 79
  • 162
  • 1
    Why not to look at execution plan and see what actually happens? It's impossible to answer without knowing what indexes are on your table and how big these are. – Evaldas Buinauskas Mar 01 '17 at 10:49
  • 1
    http://stackoverflow.com/questions/24127932/mysql-query-clause-execution-order , It seems the from clause will be executed before the where clause and the select last. If you are concerned about performance make sure you have indexes on posts_id and comments.post_id. And I would change the outdated implict join to an explicit join, – P.Salmon Mar 01 '17 at 10:51
  • If you precede your query with the keyword `explain` mysqld will tell you what it's doing. And further if you then do a `show warnings` mysql will show you exactly what query it executed. – slowko Mar 01 '17 at 10:52

1 Answers1

1

Although it (generally) does not make a difference for performance, you should always use proper, explicit JOIN syntax. Simple rule: Never use commas in the FROM clause.

Your query should be written as:

SELECT *
FROM posts p JOIN
     comments c
     ON p.id = c.post_id 
WHERE p.title = 'xxx';

Your question is about execution. The SQL language was designed to be descriptive, not procedural. That means that a SQL query describes the result set being produced. It does not specify the exact steps. In general, processing a query consists of three steps:

  1. Parsing the query (and determining the types, table, and column references).
  2. Optimizing the query for the best execution plan.
  3. Executing the query.

You are asking about the second step. The optimizer will choose the best execution plan. It will take into account table statistics, indexes, and available resources for processing the query.

In the absence of indexes, the optimizer (particularly for MySQL) would probably use nested loops joins for this type of query. It would scan the rows of the posts table, filter out those that do not match the where clause, and then loop through the comments table to get the matching rows.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786