0

I have a query against the sakila database in MySQL 8(You can easily setup one from https://hub.docker.com/r/1maa/sakila).

select * from store _0 where (
    select address_id from (
        select * from address _3 where (
            address_id = (
                select address_id from (
                    select * from store _1
                    where store_id=_0.store_id
                ) _2
            )
        )
    ) _4
);

This query returns an empty table.

However, when I rewrite the innermost

                select address_id from (
                    select * from store _1
                    where store_id=_0.store_id
                ) _2

into:

                select address_id from store _1
                where store_id=_0.store_id

, it returns all 2 rows from store table, which I expect.

Why is this? It seems I just simplified a 2-step select into 1.

I also noticed that if I use in instead of = in address_id = .. condition, I can get the expected result.

What's the theory behind this?

The query is over-complicated because it's generated. So thank you for your advice of simplified queries but I just want to know why it does not give the correct result.

And you must have MySQL 8.0.14 and above to reproduce it, because earlier versions does not allow alias beyond 1 level.

Jack Lu
  • 139
  • 2
  • 12

3 Answers3

1

The reason you would need an in clause is because the sub select is returning multiple results.

Are you just trying to join the two tables? Why not use a join instead of the subselects?

SELECT s.* 
  FROM store s
  JOIN address a
    ON a.address_id = s.address_id
Jimeh
  • 367
  • 1
  • 6
  • 16
  • It's 1 row, not 2 rows that is returned from the subquery, though the final result is 2 rows. The query is generated, so I just want to know why it does not work. – Jack Lu Sep 29 '19 at 16:28
  • I ran an explain on your query vs using the in clause. When using equals it fails to match because "no matching row in const table", but I'm not sure why that would be. – Jimeh Sep 29 '19 at 16:49
  • 1
    See my answer. It is confirmed to be a bug. – Jack Lu Oct 02 '19 at 15:02
0
            select address_id from (
                select * from store _1
                where store_id=_0.store_id
            ) _2

In the first code snippet, you are selecting the columns from store _1 that have the equivalent store_id entries as store _0. If the comparison statement returns true, the address_id is retrieved from these columns, but what if no address_id entries are stored in these columns?

            select address_id from store _1
            where store_id=_0.store_id

In the second code snippet, you are selecting the address_id from store _1 where the comparison of store_ids of store _0 and _1 returns true. The scope for selecting the address_id in store _1 is broader than in the first code snippet, thus the address_id can be retrieved.

TL;DR The different results have everything to do with the scope of your searches and the structure of data in the relational database!

Jonathan
  • 1
  • 4
  • See the following MySQL manual page: http://dev.mysql.com/doc/refman/5.0/en/select.html "A select_expr can be given an alias using AS alias_name. The alias is used as the expression's column name and can be used in GROUP BY, ORDER BY, or HAVING clauses." (...) It is not permissible to refer to a column alias in a WHERE clause, because the column value might not yet be determined when the WHERE clause is executed. See Section B.5.4.4, “Problems with Column Aliases”. REFERENCE: https://stackoverflow.com/questions/153598/unknown-column-in-where-clause – Jonathan Sep 29 '19 at 18:15
  • Sorry but I don't think your argument is relevant. For both queries, MySQL can determine the columns returned. For the 1st query, it can determine the result of inner query is a row of `store` table, which contains an `address_id` column, then know that `select address_id` from it is valid. – Jack Lu Sep 30 '19 at 08:59
  • And your reference is about column alias, not table alias. – Jack Lu Sep 30 '19 at 09:01
0

I reported this to MySQL. This is confirmed to be an optimizer bug: https://bugs.mysql.com/bug.php?id=97063.

Jack Lu
  • 139
  • 2
  • 12