How does SQL evaluate nested queries? Why are joins so much better if they mean the same thing

Question

I have 2 queries which should produce exactly the same result set

First query:

SELECT * 
FROM rms_tests where Id in (SELECT TestId 
                            FROm rms_test_results 
                             where RegressionResultID = 106967 
                               and Failed is NULL);

Second query:

SELECT t.* 
FROM rms_test_results tr 
   LEFT JOIN rms_tests t ON tr.testid=t.id 
where tr.regressionresultId = 106967 
and tr.failed is NULL;

However the second one takes 0.008 seconds while the first one never finished and hangs my "MySQL Query Browser".

How does SQL evaluate the queries of type one that causes them to blow up and hang the browser forever.

NOTE: rms_test_results is incredibly large and many times larger than rms_tests. also as you can see they are related on rms_tests.id = rms_test_result.testid

Sorry if this was covered somewhere.

++++++++++++++++++++++++++++++++++++++++++++ PS I am just really confused because If I break up the query #1 into two of them and execute them one after another they take no time at all either:

SELECT TestId FROm rms_test_results where RegressionResultID = 106967 and Failed is NULL; SELECT * FROM rms_tests where Id in (1234,1235,.....);

You can use `EXPLAIN EXTENDED` to understand how MySQL optimized the query. — Kermit, Feb 04 '14 at 16:11
http://stackoverflow.com/questions/2577174/join-vs-sub-query — Sam, Feb 04 '14 at 16:14
@ypercube Can you please explain what is the difference and how that affects the "execution plan". Thank you — Nibblerglozer, Feb 05 '14 at 19:19
My bad, I reread the second query. They are actually equivalent (and you can replace the `LEFT JOIN` with simle (INNER) `JOIN`, it will be the same result. Semijoins written with `IN (SELECT ...)` like the first query are often not well optimized in MySQL versions 5.5 and previous, especially if the second table (the `rms_test_results` in your case) is large. — ypercubeᵀᴹ, Feb 05 '14 at 20:31
I wonder though what indexes you have in the tables. An index on `(regressionresultId, Failed, testId)` would be quite helpful for both queries. — ypercubeᵀᴹ, Feb 05 '14 at 20:32
MySQL's query optimizer is known to be extremely bad with sub-selects - one of the many deficiencies of MySQL (we just had a DELETE statement that took > 30 minutes using an IN but only 0.2 seconds using a join - needless to say that the other databases where I also ran IN version never took longer than 0.2 seconds). So sub-queries aren't in general "bad in SQL", they are just bad in MySQL (and besides a JOIN is not always a replacement for an IN query) — , Feb 05 '14 at 22:10

score -2 · Answer 1 · edited May 23 '17 at 12:11

-2

Joins can usually utilize a more efficient "execution plan", while nested queries, or "subqueries" usually cannot.

Further reading: Join vs. sub-query

edited May 23 '17 at 12:11

Community

1
1

answered Feb 04 '14 at 16:11

Sev09

883
2
12
27

2

An execution plan is a term used to describe the path for *all* queries; whether they use an explicit `JOIN` or not. – Kermit Feb 04 '14 at 16:11
@Sev09 What you say is not correct. Besides what FreshPrice points about what is an execution plan, sometimes subqueries and derived tables lead to more efficient execution plans. – ypercubeᵀᴹ Feb 04 '14 at 16:14
1

@ypercube, must be that W3 schools education showing. ;) – Sev09 Feb 04 '14 at 16:43
1

Come on, don't be hard on yourself. I suggest you edit the answer so it's not hard-wired. Say with *"Joins **usually** utilize..."* and *"... subqueries **often** lead to less efficient plans"*. You could also esearch and see how this (the way that subqueries are handled) has changed in recent version of MySQL (5.6 and MariaDB 5.5) which can often "pull out" tables from subqueries, as if the query had been written without subqueries. – ypercubeᵀᴹ Feb 05 '14 at 20:19

How does SQL evaluate nested queries? Why are joins so much better if they mean the same thing

1 Answers1