2

Say I've a query like this

SELECT DISTINCT customer_name 
FROM borrower 
WHERE customer_name in (SELECT customer_name FROM depositor)

We can consider above as two queries

A

SELECT DISTINCT customer_name 
FROM borrower 
WHERE customer_name

B

SELECT customer_name 
FROM depositor

Which executes first? Is this rule the same for all sub queries?

What types of sub queries can we write? Give me some pointers to get started with a tutorial link.

  • 1
    Some pointers? What about the manual? http://dev.mysql.com/doc/refman/5.5/en/index.html –  Sep 22 '12 at 11:26
  • 1
    You could improve the IN with a EXISTS ... such as WHERE EXISTS (SELECT 1 FROM depositor AS b WHERE b.customer_name = borrower.customer_name) – AlexP Sep 22 '12 at 14:18

1 Answers1

0

Edit as first answer was wrong. But explanaition was correct :|

A executes first. Applies "all" subqueries presented above as the subqueries are correlated.

+----+--------------------+-----------+----------------+---------------+---------------+---------+------+------+----------+--------------------------+
| id | select_type        | table     | type           | possible_keys | key           | key_len | ref  | rows | filtered | Extra                    |
+----+--------------------+-----------+----------------+---------------+---------------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY            | borrower  | index          | NULL          | customer_name | 52      | NULL |    6 |   100.00 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | depositor | index_subquery | customer_name | customer_name | 52      | func |    1 |   100.00 | Using index              |
+----+--------------------+-----------+----------------+---------------+---------------+---------+------+------+----------+--------------------------+

Btw. MySQL 6 will remove this problem and the selection type will become PRIMARY for both queries. See 1

tuergeist
  • 9,171
  • 3
  • 37
  • 58
  • Not downvoting as MySQL isn't really my area but is this really true? So you're implying that in MySQL all sub queries get materialized into a temporary table first? MySQL can't for example do a merge join here? – Martin Smith Sep 22 '12 at 11:32
  • MySQL is optimized for joins and won't optimize subqueries. Some subqueries can be written as destict selection to improve their speed. ´´SELECT col FROM t1 WHERE id_col IN (SELECT id_col2 FROM t2 WHERE condition); That statement can be rewritten as follows: SELECT DISTINCT col FROM t1, t2 WHERE t1.id_col = t2.id_col AND condition;´´ – tuergeist Sep 22 '12 at 11:43
  • @MartinSmith: AFAIK the optimizer in MySQL is so limited that it won't rewrite such a query. MySQL has historically been very bad with sub-selects (I think it's actually stated in the manual somewhere) –  Sep 22 '12 at 11:48
  • What if `depositor` has 1 million rows and `borrower` two? Are you saying that it will still evaluate the query on `depositor` first? – Martin Smith Sep 22 '12 at 11:49
  • MySQL will do as explained above. – tuergeist Sep 22 '12 at 11:52
  • Martin: No, I think that the opposite is true. Subqueries in `WHERE` are not evaluated first but a lookup is done for every row in the external table. So, the efficiency would be worse when there are many rows in `borrower` and few in `depositor`. MariaDB latest versions have quite a lot optimizations strategies included: [MariaDB 5.3 and 5.5: Semi-join subquery optimizations](https://kb.askmonty.org/en/semi-join-subquery-optimizations/) MySQL 5.6 will have, too, some optimizer improvements. – ypercubeᵀᴹ Sep 22 '12 at 14:29
  • @ypercube:Subqueries in WHERE are not evaluated first but a lookup is done for every row in the external table What do you mean by this ?Can you provide some more information. –  Sep 22 '12 at 16:36
  • @redsaw: See this answer at SO [Why would an IN condition be slower than “=” in sql?](http://stackoverflow.com/questions/3417074/why-would-an-in-condition-be-slower-than-in-sql/3417190#3417190) and the link to MySQL buglist. – ypercubeᵀᴹ Sep 22 '12 at 17:25
  • @ypercude:I could not under stand this ,can you explain this with example or in detail :Subqueries in WHERE are not evaluated first but a lookup is done for every row in the external table What do you mean by this ? –  Sep 23 '12 at 12:12