10

Using MySQL 5.6, this request

SELECT foo
FROM bar
WHERE groupId = '1'
  AND MATCH (foo) AGAINST ('"myQuery"' IN BOOLEAN MODE);

and

SELECT foo
FROM bar
WHERE groupId = '1'
  AND foo like '%myQuery%';

returns both correct results, but when I combine the two with:

SELECT foo
FROM bar
WHERE groupId = '1'
  AND (
    MATCH (foo) AGAINST ('"myQuery"' IN BOOLEAN MODE)
    OR foo LIKE '%myQuery%'
  );

I got some extra results, which do not appear in any of the first two requests, and didn't contain myQuery at all.

Is there any trick with the parenthesis that I missed?
Or could it be related to any sort of index cache? Sometimes the results are correct, and suddenly, there are not anymore.

I also tried with

WHERE (
  groupId = '1' AND MATCH (foo) AGAINST ('"myQuery"' IN BOOLEAN MODE)
) OR (
  groupId = '1' AND foo like '%myQuery%'
);

Edit: here are the results of my requests, with myQuery = 'gold'.

The 1st and 2nd one returns:

'Fancy gold'
'Nice gold'
'Super Nice gold'
'Ugly gold'

The last one returns:

'Cornices diamond'
'Custom'
'Fancy gold'
'Nice gold'
'Super Nice gold'
'Ugly gold'

One other thing I noticed, I ran Optimize table bar, and then results are correct. I ran again 1st request again, and then the 3rd results are not correct anymore. So I really suspect something related to the full-text index.

Edit 2: Here is a dbFiddle : https://www.db-fiddle.com/f/iSXdTK7EzfoQ46RgDX7wF3/1

Yves M.
  • 29,855
  • 23
  • 108
  • 144
egoz
  • 354
  • 3
  • 16
  • 2
    https://dbfiddle.uk/?rdbms=mysql_8.0 create here db table and with data show us your problem – Zaynul Abadin Tuhin Sep 10 '19 at 07:28
  • Same as you, I always end up with strange result using `OR`. Personally, I won't use it, I rather use `UNION` or `UNION ALL`. I try very much avoid using `OR`. – FanoFN Sep 10 '19 at 07:31
  • 4
    @tcadidot0 `OR` works perfectly fine in mySql, don't blame a tool if you can't use it properly, and please don't try to persuade people to use other tools that are not the same thing at all – Kaddath Sep 10 '19 at 07:39
  • Pardon me @kaddath but if the OP have issue using the tool especially when OP said _Sometimes the results are correct, and suddenly, there are not anymore._ why wouldn't OP try a different tool instead? Besides, I specifically mention it as my personal opinion and suggest OP to try a different way where the result is much certain and not _Sometimes the results are correct, and suddenly, there are not anymore._ – FanoFN Sep 10 '19 at 07:47
  • @tcadidot0 Using `UNION` returns correct values yes. But instead of blindly use it, I would like to understand why is the `OR` one is so weird. Furthermore, the `UNION` version is more expensive – egoz Sep 10 '19 at 07:51
  • 2
    @tcadidot0 Well in almost 30 years of coding I've never seen a language feature that behaves strange per se. It's either the feature that's buggy (and you can know it), a rule that you don't know or understand but most likely it's your code or data that's strange. If you come up with strange results with `OR` in mySql, why not ask a question like the OP? For the question here, examples of data and unexpected results would be welcome – Kaddath Sep 10 '19 at 08:00
  • @Kaddath , I respect that and I admit I don't have many experience in using `OR`. I didn't asks about the strange results I got while using `OR` because I've search in SO in find a question discussing about the difference between `OR` and `UNION`. The accepted answer said that `OR` is equivalent to `IN` and when I applied `IN` it solved my issue. – FanoFN Sep 10 '19 at 08:10
  • @Kaddath, I've edtited my question with results examples – egoz Sep 10 '19 at 08:14
  • Your previous (similar) question happened on aws. Were you able to reproduce this behaviour on MySQL (e.g. on dbfiddle), or is this still happening only on aws? If so, you should specify (and tag) it, as it might be an aws bug/aws specific, unreproducible on plain mysql. – Solarflare Sep 10 '19 at 08:56
  • @Solarflare, I'm currently testing this on a local MySql 5.7 – egoz Sep 10 '19 at 09:16
  • 1
    First, I'd check if you committed all data (fulltext indexes work with committed data, although I'm not sure how I'd reproduce what you are describing). Can you create a fiddle (see 1st comment, or sqlfiddle.com) with the exact code you run? (Create table, insert/updates, queries, optimize, queries). Even if it doesn't show the effect, we can then just copy&paste it and test in on a local installation (specify your exact version). If it happens for you on 2 independent installations (aws, local), there seem to be a subtlety that you/we are missing which we might spot in your concrete code. – Solarflare Sep 12 '19 at 06:15
  • @egoz: would you be able to provide sample data that can help reproduce the issue? Without sample data, we are kind of blind in diagnosing what could be wrong. – GMB Sep 16 '19 at 22:04
  • Please provide `SHOW CREATE TABLE`. And sample rows, if practical. – Rick James Sep 17 '19 at 00:45
  • Can you please try to reproduce the problem with https://dbfiddle.uk/?rdbms=mysql_8.0 ? – Yves M. Sep 17 '19 at 07:25
  • 1
    I just added a dbFiddle so you can test the behaviour – egoz Sep 17 '19 at 09:28
  • @egoz this behaviour is not reproducing in MySQL 5.7 and above. FTS was introduced for first time in InnoDB in version 5.6 You possibly have encountered a bug in early introductory versions. Check the fiddle for 5.7, where it is not reproducing: https://www.db-fiddle.com/f/hivf5CnJnbbGtXB8oWLhLN/8 So, one advice: **UPGRADE** – Madhur Bhaiya Sep 17 '19 at 11:01
  • @tcadidot0 btw this bug is in calculation of [TF-iDF](https://dev.mysql.com/doc/internals/en/full-text-search.html) in introductory versions of FTS in InnoDB. MySQL 5.6 is calculating a non-zero value for even non-matching rows. This has nothing to do with `OR` clause. Check the result of third query in this fiddle: https://www.db-fiddle.com/f/hivf5CnJnbbGtXB8oWLhLN/9 – Madhur Bhaiya Sep 17 '19 at 11:05
  • Yes @MadhurBhaiya thanks. It's not my concern at all because I don't think I'm gonna use `OR`, at least not now but I'll appreciate the information and I will test it on our working servers just to make sure. We're not using MySQL 5.xx in any of our servers at the moment but I'm curious to know if this bug appear in MariaDB, hopefully not. – FanoFN Sep 18 '19 at 02:14
  • File a bug with bugs.mysql.com . Note that it fails with 8.0, so upgrading does not seem to be useful. – Rick James Sep 18 '19 at 05:52
  • @tcadidot0 - There is a big difference in the implementation between `OR` and `UNION`. So switching to `UNION` might be the best workaround. – Rick James Sep 18 '19 at 05:57
  • Wait, I'm confused @RickJames . Did you mean to type _"might be the best workaround"_ or _"might *not* be the best workaround"_? – FanoFN Sep 18 '19 at 06:46
  • @tcadidot0 - I have not thought of a better workaround. – Rick James Sep 18 '19 at 14:23
  • I realize that this question is not about how to get the correct result. Its more to "Why is this happening when I use `OR`?" So basically the answer is the MySQL version has a bug. If like that, there's no other solution than to use a workaround specifically for this MySQL version. Another solution without a workaround is to upgrade as suggested and tested by Madhur Bhaiya but as Rick James point out that this behavior also present in 8.0, it restricted the upgrade solution to maybe only v5.7. – FanoFN Sep 20 '19 at 01:32
  • It seems that once the `MATCH` succeeds, the `LIKE` will then match to any string _if the LIKE uses the same column_. – Turophile Sep 20 '19 at 03:22

3 Answers3

1

A self-join seems to get around the problem by tricking the db into thinking that it is not the same name column in the two predicates:

SELECT 
   t1.name 
FROM
    m as t1
    INNER JOIN m as t2 ON t1.id = t2.id 
WHERE
    t1.sId = 'N'
    AND (
      MATCH (t1.`name`) AGAINST ('"foo"' IN BOOLEAN MODE)
      OR t2.`name` LIKE '%foo%'
    );

As I mentioned in a comment above, in the original SQL, it seems that if the MATCH gives a TRUE result for any row, the LIKE will match to anything. This works around the problem by treating the two name columns as different even though they are from the same table.

EDIT: Interestingly, a correlated sub-select doesn't have the same beneficial effect:

SELECT 
   t1.name 
FROM
    m as t1
WHERE
    t1.sId = 'N'
    AND (
      MATCH (t1.`name`) AGAINST ('"foo"' IN BOOLEAN MODE)
      OR 1 = (SELECT 1 
              FROM m as t2 
              WHERE t1.id = t2.id 
              AND t2.`name` LIKE '%foo%' ) 
    );
Turophile
  • 3,367
  • 1
  • 13
  • 21
0

I have a very strong feeling that what you described is what you would want for the queries to look like, but that you are actually using code to compose the queries and you have a bug in your code that makes the myQuery null... then you are matching against '%%' which always matches.

When you invoke a bug, you should post the actual queries, not some wishful interpretation of what the queries might look like when transcribed to actually fit the purpose.

Are you maybe using ? ? placeholders and you forgot to set a second value equal to the 1st when data binding?

Dinu
  • 1,374
  • 8
  • 21
0

This seems give the result that you want

SELECT 
   name
FROM
    m
WHERE
    sId = 'N'
    AND MATCH (`name`) AGAINST ('"foo"' IN BOOLEAN MODE)
    OR `name` LIKE '%foo%';
Erwin
  • 460
  • 2
  • 6