1

Suppose we have a table such as

CREATE TABLE test{
title VARCHAR(32),
city VARCHAR(32),
description TEXT
...

And in a query say we have

SELECT * FROM test WHERE MATCH(title, description) AGAINST('xyz' IN NATURAL LANGUAGE MODE) AND city = 'ABC';

Will MySQL know to use the "city" condition first, or should we be more explicit and use a subquery?

Jiulin Teng
  • 299
  • 3
  • 8
  • It depends on indices and statistics.. [**view the query plan**](https://dev.mysql.com/doc/workbench/en/wb-performance-explain.html) to see what MySQL actually selected. – user2864740 Dec 20 '19 at 20:01
  • Wrt multiple indices: https://stackoverflow.com/a/12222699/2864740 ; the query planner still might choose to *not* use [additional] indices, based on carnality estimates. This is why viewing the actual query plan (which can change over time!) is important. – user2864740 Dec 20 '19 at 20:07
  • See https://stackoverflow.com/q/12427110/2864740 for some considerations. Probably a dup. question too.. – user2864740 Dec 20 '19 at 20:10
  • Anyway, back to the title: it is unlikely that a sub-query will be an more efficient. A non-dependent sub-query is logically "unwrapped" anyway. YMMV on if this somehow triggers a behavior change in the query planner. See first comment. – user2864740 Dec 20 '19 at 20:13
  • Please provide `EXPLAIN SELECT` both for the query as stated, and with the `AND` clauses swapped. (The text version pasted in will suffice.) – Rick James Dec 22 '19 at 00:52
  • Both show the same: id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE art NULL fulltext art_fkey2,art_ft art_ft 0 const 1 33.33 Using where; Ft_hints: sorted – Jiulin Teng Dec 22 '19 at 09:39

1 Answers1

0

From looking at the code, the MATCH expression will be evaluated already in the query optimization phase. This means that all the rows containing 'xyz' will be identified before the condition on city is considered. (At least, this is how I understand it to for work for InnoDB. I do not know the details of how this is implemented in MyISAM.) During query execution, when the WHERE clause is evaluated, expressions are evaluated from left to right. (This is the current implementation, and may change in future versions.) Since MATCH scores have already been computed, at this point one just evaluates whether they are non-zero.

If your city column is indexed, the query optimizer may choose to use this index to scan only rows from the given city, and compare the MATCH scores for just these rows. However, all rows containing 'xyz' are still first identified. The EXPLAIN output for the query will show if the index is used.

I doubt that using a subquery will help anything. If the subquery is correlated, you may even risk that the full-text search is performed multiple times.

Øystein Grøvlen
  • 1,266
  • 6
  • 8
  • So if `city` is put on the left before the MATCH there could be an advantage? I'm not sure about the EXPLAIN, but the other condition is indeed indexed. EXPLAIN identifies it as a "possible key". – Jiulin Teng Dec 20 '19 at 22:25
  • Hmmm... I have long believed that (1) the order in `WHERE` does not matter (except for extreme cases such as a subquery), and (2) `MATCH` is always performed first. (I think MyISAM stated that it would occur first.) – Rick James Dec 22 '19 at 00:52
  • Putting city first, will only give a minor improvement, since most of the "heavy-lifting" for MATCH will have already been done during query optimization. – Øystein Grøvlen Dec 22 '19 at 15:04