3

Say I have a query with subqueries in CASE...WHEN...THEN like this (simplified, as the real query I'm working on is little harder on the eyes):

SELECT
  CASE 
    WHEN (subquery1) = 1
      THEN (subquery2)
    WHEN (subquery1) > 1 AND (subquery3) = 1
      THEN (subquery4)
  END
FROM foo

All 4 subqueries are dependent subqueries.

Do all 4 subqueries get executed? Cause when I did an EXPLAIN all the subqueries are included in the plan.

Or do only subquery1 get executed first, and if that condition is met, subquery2 will be executed? If not, subquery3 will be executed to examine if the condition is met, and so on... When I re-wrote the query as a stored procedure to only execute the only relevant subqueries this way, there was a performance increase (time reduction). I'm curious to know if the original query was taking more time since it executes all the subqueries, regardless of whether the previous condition have been met.

I tried turn on the general log but the subqueries don't get logged individually. They're logged together as the whole query, so I can't tell which subquery was actually executed. Also tried looking at the MySQL docs on SUBQUERY but I haven't found anything yet.

GMB
  • 216,147
  • 25
  • 84
  • 135
Tien Phan
  • 55
  • 9
  • Have you tested it on your particular version of MySQL to see the performance characteristics? This kind of query is really asking a lot of the RDBMS. You should probably conditionally execute each of those queries independently, or combine with a `UNION`. – tadman Dec 09 '19 at 23:51
  • That's a very bad query (at least in terms of performance) if you have anything but an extremely small amount of data in your tables. I think this is most likely an XY problem - rather than this question, it might be better to ask a question about how to better write the query you're actually using so that all of those subqueries aren't needed. – Ken White Dec 10 '19 at 00:00

2 Answers2

3

I find that this is an interesting question.

Explain plans cannot really tell you the answer, since they are static, while condition evaluation occurs at runtime.

The documentation of the case expression does not give details on the order of evaluation of the when conditions. The documentation of the case statement, however, states:

each WHEN clause search_condition expression is evaluated until one is true, at which point its corresponding THEN clause statement_list executes. If no search_condition is equal, the ELSE clause statement_list executes, if there is one.

This would tend to indicate that, once a condition is satisfied, further conditions are not checked. But your code contains case expressions, not case statements (which belong to stored procedures).

For what it's worth, here is a (over simplified) test scenario with a case expression. Consider the following code:

select case 
    when (select 1) = 1 then 'should stop here'
    when (select 1 union all select 2) = 1 then 'should not get there, or it will die'
end t

The first condition evaluates as true. The second condition, if evaluated, would raise runtime error Subquery returns more than 1 row.

When we run the query in this db fiddle, we get the expected result, and no runtime error happens:

| t                |
| ---------------- |
| should stop here |

So this also tend to indicate that subqueries are evaluated sequentially, and that MySQL stops evaluating conditions as soon as possible. However, please don't it for granted: in absence of a clear statement in the documentation, this is just empiric!

GMB
  • 216,147
  • 25
  • 84
  • 135
  • 1
    Just recently I tried to help with [this](https://stackoverflow.com/questions/59147909/another-1054-unknown-column-in-field-list-mystery/59148428) and found that in case of error (missing column) in some parts of case expression it is actually gives error even when that part ox expression is not reachable. – fifonik Dec 10 '19 at 01:37
  • @fifonik could it be that, since "missing column" is NOT a runtime error and thus caught before execution? And because of that it's not indicative of which subquery occurs during execution. Meanwhile, "subquery returns more than 1 row" IS a runtime error, and can only be caught during execution. – Tien Phan Dec 10 '19 at 03:42
  • See my answer below. The missed column error was not in WHEN, but in THEN section of CASE expression. I have not expected to see the error. I still hope I made some mistake in my code :) – fifonik Dec 10 '19 at 04:03
  • @fifonik: Tien Phan is correct, a missing column error is a compile error, not a runtime error. It is raised *before* the statement is executed. – GMB Dec 10 '19 at 07:10
  • A guess: The Optimizer could completely evaluate `(select 1 union all select 2)` even before evaluating the main query, thereby prematurely raising the error message. We need the OP to show us the subqueries. – Rick James May 19 '22 at 22:19
1

I've found CASE expression evaluation is tricky. Here is example:

CREATE TABLE `tt` (
    `id` INT(11) NULL DEFAULT NULL
);

INSERT INTO `tt` (`id`) VALUES (1);

DELIMITER $$

CREATE FUNCTION `ff1`(`f` VARCHAR(15)) RETURNS int(11)
BEGIN
    -- Case EXPRESSION
    RETURN CASE
        WHEN f = 'id' THEN (SELECT id FROM tt LIMIT 1)
        WHEN f = 'z'  THEN (SELECT z/*non-existing column*/ FROM tt LIMIT 1)
    END;
END$$

CREATE FUNCTION `ff2`(`f` VARCHAR(15)) RETURNS int(11)
BEGIN
    -- Case CLAUSE
    CASE
        WHEN f = 'id' THEN SELECT id INTO @result FROM tt LIMIT 1;
        WHEN f = 'z'  THEN SELECT z/*non-existing column*/ INTO @result FROM tt LIMIT 1;
    END CASE;
    RETURN @result;
END$$

DELIMITER ;

SELECT ff1('id'); -- error 'Unknown column z...'
SELECT ff1('z');  -- error 'Unknown column z...'
SELECT ff2('id'); -- 1
SELECT ff2('z');  -- error 'Unknown column z...'
fifonik
  • 1,556
  • 1
  • 10
  • 18