0

Is there something wrong querying like this:

select * from category_cross_selling
where 'SOMEID' in (f_categories_from, f_categories_to);

instead of like this?:

select * from category_cross_selling
where f_categories_from = 'SOMEID' or f_categories_to = 'SOMEID';

I do this sometimes, but I don't know if it is bad practice or if I break the optimizer engine of MySQL or I don't know, but people have told me to stop doing it.

I like it because it is clear that it is the same id that I am looking for in any columns, and you don't duplicate constants, thus avoiding bugs.

EDIT: Notice that the reason I asked this question is because normally in IN searches, you put the column in the left side and N constants in the right side

santiago arizti
  • 4,175
  • 3
  • 37
  • 50

3 Answers3

0

Since both syntax are valid, the only reason not to use your version with in operator would be if the or version performed better. I would test both versions each time and see which wins.

Shadow
  • 33,525
  • 10
  • 51
  • 64
0

The two should have the same performance -- which is not good (I'll get to that in a moment). The in version has the advantage of being shorter and less prone to error, so don't be confused by it.

Probably the most performant way to write the query is:

select *
from category_cross_selling
where f_categories_from = 'SOMEID'
union all
select *
from category_cross_selling
where f_categories_from <> 'SOMEID' and -- may need to take `NULL` into account
      f_categories_to = 'SOMEID';

This version can take advantage of indexes on category_cross_selling(f_categories_from) and category_cross_selling(f_categories_to, f_categories_from). Both indexes are needed, each for one of the subqueries.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
-1

Enable statistics running below queries.

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

Now click on Include Actual Execution Plan(or Ctrl+M). Now run both queries at same time. You would be able to see actual execution plan and estimated execution plan for both queries in Execution plan section. Also look at messages section to see how many logical reads has been done and other parts. From this you can see what is the difference plans for those queries. If both are same there is no difference at all.

Update

I have seen the properties of table scan in Argument section from execution plan for both queries and found that even if you write 'SOMEID' in (f_categories_from, f_categories_to) SQL engine will convert that into f_categories_from = 'SOMEID' or f_categories_to = 'SOMEID' internally as below. So there is no difference at all.

For 'SOMEID' in (f_categories_from, f_categories_to):

Argument: OBJECT:([Database].[dbo].[TestTable]), WHERE:('SOMEID'=[TestTable].[f_categories_to] OR 'SOMEID'=[TestTable].[f_categories_from])

For f_categories_from = 'SOMEID' or f_categories_to = 'SOMEID':

Argument: OBJECT:([Database].[dbo].[TestTable]), WHERE:([TestTable].[f_categories_from]='SOMEID' OR [TestTable].[f_categories_to]='SOMEID')