0

Is there an "efficient" (in a single statement) way to get MySQL to only display columns that triggered results in the WHERE clause when multiple conditions are used?
I know this can be done by UNION joining SQL statements. I want to know specifically which column triggered the result, since both columns are displayed when either condition returns true. For example:

Table:
idx1       idx2
1          2
5          6

sql statement: SELECT idx1, idx2 FROM myTable WHERE idx1 IN(1,2,3) OR idx2 IN(4,5,6)

Obviously both columns are returned in the above 2 rows, being only one of the conditions are met for each. Is there a way I can get:

idx1       idx2
1          NULL
NULL       6
HappyBear
  • 13
  • 2

4 Answers4

2

You can move the conditions to the select, using a case expression:

SELECT (CASE WHEN idx1 IN (1, 2, 3) THEN idx1 END) as idx1,
       (CASE WHEN idx2 IN (4, 5, 6) THEN idx2 END) as idx2
FROM myTable
WHERE idx1 IN (1,2,3) OR idx2 IN (4, 5, 6);
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • WOW! Thanks for your lightning fast reply. I'm gonna look into the CASE function. Would the above be considered efficient? Seems very repetitive since I'm gonna have many values in the IN clause. Also: your not just moving, your also repeating it from WHERE. – HappyBear Mar 25 '19 at 17:52
  • This is the working solution that I ended up using for now and checked it. However, Bill has shown below that UNION is just as or even more "efficient" for this. – HappyBear Mar 27 '19 at 18:22
  • @HappyBear . . . `UNION ALL` may be more efficient if you have indexes on each column. Without indexes, this should be more efficient. – Gordon Linoff Mar 28 '19 at 22:19
1

Try using a CASE statement.

CASE WHEN idx1 IN (1,2,3) Then
    idx1
ELSE
    NULL
END AS MyColumnName
Jon
  • 31
  • 5
  • Apologies to @gordon, we posted at the same time. This link may be relevant: [If Then Else in SQL](https://stackoverflow.com/questions/63447/how-do-i-perform-an-if-then-in-an-sql-select?rq=1) – Jon Mar 25 '19 at 17:50
1

You can split up the query into two, and UNION them together:

SELECT idx1, NULL AS idx2
FROM myTable
WHERE idx1 IN (1,2,3)
UNION ALL
SELECT NULL, idx2
FROM myTable
WHERE idx2 IN (4,5,6);

Edit: I know you said you knew it could be done with UNION, but you were concerned about making it clear which rows were found with each condition.

The solution above should satisfy your requirement that the "other" column is shown as NULL, because we can put NULL into the select-list explicitly instead of naming the other column.

You should define the column aliases in the first query in the UNION. Column names of subsequent queries in the UNION will use the column aliases defined in the first query.


Re comments from @Kaii:

Index merge optimization doesn't help as often as you might expect. Here's a demo, tested on MySQL 8.0.14:

CREATE TABLE `mytable` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `idx1` int(11) DEFAULT NULL,
  `idx2` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`),
  KEY `idx1` (`idx1`),
  KEY `idx2` (`idx2`)
);

I filled it with 1M+ rows, with random values between 0 and 50 for idx1 and idx2.

select count(*) from mytable;
+----------+
| count(*) |
+----------+
|  1048576 |
+----------+

The query using OR does not do an index-merge, it does a table-scan (type=ALL):

explain SELECT idx1, idx2 FROM myTable WHERE idx1 IN(1,2,3) OR idx2 IN(4,5,6);
+----+-------------+---------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | myTable | NULL       | ALL  | idx1,idx2     | NULL | NULL    | NULL | 1046577 |    51.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+---------+----------+-------------+

Whereas my solution with UNION does use both indexes, and does not use a temp table:

explain SELECT idx1, NULL AS idx2 FROM myTable WHERE idx1 IN (1,2,3) UNION ALL SELECT NULL, idx2 FROM myTable WHERE idx2 IN (4,5,6);
+----+-------------+---------+------------+-------+---------------+------+---------+------+--------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key  | key_len | ref  | rows   | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------+------+---------+------+--------+----------+--------------------------+
|  1 | PRIMARY     | myTable | NULL       | range | idx1          | idx1 | 5       | NULL | 113264 |   100.00 | Using where; Using index |
|  2 | UNION       | myTable | NULL       | range | idx2          | idx2 | 5       | NULL | 112678 |   100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+--------+----------+--------------------------+

I profiled both queries, and find that the UNION query is quite a bit faster in this case.

show profiles;
+----------+------------+-----------------------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                                                       |
+----------+------------+-----------------------------------------------------------------------------------------------------------------------------+
|        1 | 0.22477500 | SELECT idx1, idx2 FROM myTable WHERE idx1 IN(1,2,3) OR idx2 IN(4,5,6)                                                       |
|        2 | 0.05573200 | SELECT idx1, NULL AS idx2 FROM myTable WHERE idx1 IN (1,2,3) UNION ALL SELECT NULL, idx2 FROM myTable WHERE idx2 IN (4,5,6) |
+----------+------------+-----------------------------------------------------------------------------------------------------------------------------+
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • OP explicitly excluded UNION as an option as you can read in the question. – Kaii Mar 25 '19 at 18:08
  • They did not exclude UNION as an option. They said they knew it could be done with UNION. But they may have overlooked using NULL explicitly in each select-list. – Bill Karwin Mar 25 '19 at 18:16
  • You're right. But the UNION is not as efficient as the other solution. Removed my downvote. – Kaii Mar 25 '19 at 22:09
  • Really? Using a UNION means that if there's an index on column `idx1` and another index on column `idx2`, then both indexes will be used. It's a common way of optimizing a condition with `OR` between two the terms referencing different columns. Trying to combine them in one query will cause a table-scan. – Bill Karwin Mar 25 '19 at 23:05
  • Will not cause FULL SCAN. MySQL knows index merge operations since 5.7 https://dev.mysql.com/doc/refman/8.0/en/index-merge-optimization.html while what you say is true for older versions, in current mysql you don't need to split your query. It harms more than it helps optimizing. – Kaii Mar 26 '19 at 07:24
  • With Union, two individual queries will be executed, and the resultsets need to be merged, i guess by storing in a temp table. Will perform bad for large resultsets or when trying to optimize order/group by using indexes. In the other variant, only two indexes are queried, one of which might contain a sorting for group/order by. The result is merged internally before executing the row lookup. So yeah, really. – Kaii Mar 26 '19 at 07:40
  • @Kaii, see tests above. – Bill Karwin Mar 26 '19 at 14:33
  • Bill's DURATION numbers seem to prove his point and that's w/o the extra possible load of adding the CASE function. Great discussion & learning a lot from it! (I'm adding UNION to the list of tags) – HappyBear Mar 26 '19 at 17:23
  • You are right, nice catch. I guess due to the range scan over an index with low selectivity the optimizer seems to decide to do a full table scan instead. I guess it depends on the number of distinctive values for the index? I learned something today! Here, take all the upvotes i can give. – Kaii Mar 26 '19 at 17:25
  • I wonder what happens when you sort or group by any of the foreign keys? I guess temp table + filesort for BOTH variants? – Kaii Mar 26 '19 at 17:26
  • 1
    `ORDER BY` can skip the filesort if the order of the index it uses to read the rows is the same as the order you want the data returned. `GROUP BY` might skip the temp table if the query reads rows in index order that naturally groups the rows. Like if you `COUNT()` by group, and you make sure to read rows in group order, it doesn't have to keep a temp table for tallying. Some other types of aggregation may still need to use a temp table (like `AVG()`). This all has nothing to do with using `OR` vs. `UNION`. – Bill Karwin Mar 26 '19 at 18:08
  • Interesting. Didn't know AVG needs a tmp table, but it makes a lot of sense. Seems i should take a look at your book again.. ;) – Kaii Mar 26 '19 at 18:59
0

I would do:

select
  case when idx1 in (1, 2, 3) then idx1 end as idx1,
  case when idx2 in (4, 5, 6) then idx2 end as idx2
from t
The Impaler
  • 45,731
  • 9
  • 39
  • 76