I've been testing MySQL's full-text boolean mode, and it seems you can't use the minus sign on a multi-word phrase?

I have two rows, for example:

id,name
1,2011-12 Fleer Retro auto jordan non
2,1999 jordan non auto

If I run the following query:

SELECT auction_id,`name`,description FROM auctions WHERE MATCH(`name`) AGAINST('+jordan +auto -non' IN BOOLEAN MODE);

Neither row is returned, as expected. However, if I run this query:

SELECT auction_id,`name`,description FROM auctions WHERE MATCH(`name`) AGAINST('+jordan +auto -"non auto"' IN BOOLEAN MODE);

Again, neither row is returned (same result). Shouldn't row 1 come back?

Edit: My ft_min_word_len is set to 2 and I have disabled my stop words file, so it has nothing to do with that.

Fiddle: http://sqlfiddle.com/#!2/d1987/4

However, it seems the fiddle uses the default stop words file, so testing it with the word "non" doesn't work there.
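
For anyone who wants to reproduce this outside the fiddle, here is a minimal sketch. The column types, the index name `ft_name`, and the MyISAM engine are assumptions for illustration, not my real schema; only the table name, the column names, and the two sample rows come from the question above.

```sql
-- Hypothetical minimal schema: only the columns used in the queries above.
-- MyISAM is assumed because ft_min_word_len is a MyISAM full-text setting.
CREATE TABLE auctions (
  auction_id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `name`      VARCHAR(255) NOT NULL,
  description TEXT,
  FULLTEXT KEY ft_name (`name`)
) ENGINE=MyISAM;

-- The two sample rows from the question.
INSERT INTO auctions (`name`) VALUES
  ('2011-12 Fleer Retro auto jordan non'),
  ('1999 jordan non auto');

-- Baseline: both words required, nothing excluded.
SELECT auction_id, `name` FROM auctions
WHERE MATCH(`name`) AGAINST('+jordan +auto' IN BOOLEAN MODE);

-- Same search, excluding only the exact phrase "non auto".
SELECT auction_id, `name` FROM auctions
WHERE MATCH(`name`) AGAINST('+jordan +auto -"non auto"' IN BOOLEAN MODE);
```

Comparing the two result sets shows whether the phrase exclusion removes more rows than it should. The server needs the same `ft_min_word_len` and stop word settings as mine for the word "non" to be indexed at all.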

Brett
  • Try adding a few more rows to the table. – Strawberry Jun 10 '13 at 13:01
  • @Strawberry sorry, the table does have more than 2 rows...... it actually has `164` rows in total. – Brett Jun 10 '13 at 13:04
  • And approximately what percentage match the criteria you provide? – Strawberry Jun 10 '13 at 13:08
  • @Strawberry `5` without the minus sign and `3` when using `-"non-auto"`; but it should be 4. – Brett Jun 10 '13 at 13:11
  • Care to provide a sqlfiddle? – Strawberry Jun 10 '13 at 13:32
  • @Strawberry Ok, here we go http://sqlfiddle.com/#!2/f143c - I tried to insert some rows but it was incredibly whiny about inserting too large statements. I can make up a smaller fiddle with just the required rows if that's better. – Brett Jun 10 '13 at 14:56
  • Brett. Yes make a smaller fiddle with just the relevant info, and edit your original post to include the full sqlfiddle address - AND show us the desired result set!!! – Strawberry Jun 10 '13 at 15:21
  • @Strawberry How many results do you want loaded? Just the 5 matching ones, the whole table or about 20 or so to avoid the 50% issue? – Brett Jun 10 '13 at 15:30
  • @Strawberry I have updated the question with the new fiddle and some other details. – Brett Jun 10 '13 at 15:49
  • I *think* the problem is that your keyword is too short! – Strawberry Jun 10 '13 at 16:01
  • @Strawberry Naaa, I have disabled the stop words file and have the `ft_min_word_len` set to `2`. – Brett Jun 10 '13 at 16:10
  • Hmm, however that IS why the example in the fiddle won't work http://sqlfiddle.com/#!2/9cd5c/1 See? – Strawberry Jun 10 '13 at 16:17
  • @Strawberry Yeah I know, too bad we can't control those aspects on fiddle huh!? – Brett Jun 10 '13 at 16:43
  • To your SqlFiddle: `SELECT auction_id,`name` FROM auctions WHERE MATCH(`name`) AGAINST(' +auto +jordan -"jordan auto"' IN BOOLEAN MODE);` - this returns nothing, although I would expect to get `2011-12 Fleer Retro auto jordan non` back. – Stoleg Jun 24 '13 at 14:11
  • It looks like a bug to me. – Stoleg Jun 24 '13 at 14:12
  • @Brett You can log a bug (I do not work with MySQL much) on bug tracker. This should close this question. – Stoleg Jun 26 '13 at 20:20

2 Answers

0

Row 1 does not come back because a negative action (exclude) takes precedence over a positive action (include). This is common practice in security, for example, where a DENIED permission has priority over an ALLOW or GRANT permission.

From MySQL 12.9.2. Boolean Full-Text Searches:

Note: The - operator acts only to exclude rows that are otherwise matched by other search terms. Thus, a boolean-mode search that contains only terms preceded by - returns an empty result. It does not return “all rows except those containing any of the excluded terms.”

Hence any query like:

 ... AGAINST('+Any_string -"any_string"' IN BOOLEAN MODE)

will yield nothing.

UPDATE

-"non auto" blocks "auto" from appearing in the search results because non is a stopword and is dropped from the search string. Another reason this word may be excluded from a BOOLEAN MODE search is that it is too short (the default ft_min_word_len is 4):

If the phrase contains no words that are in the index, the result is empty. For example, if all words are either stopwords or shorter than the minimum length of indexed words, the result is empty.
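
To rule out both causes, it may help to confirm the settings the server is actually running with and then rebuild the index, since words skipped when the index was built stay missing until it is rebuilt. This is only a sketch, and it assumes `auctions` is a MyISAM table (the engine is not stated in the question):

```sql
-- Show the full-text settings currently in effect.
-- Both are read at server startup, so a restart is needed after changing them.
SHOW VARIABLES LIKE 'ft_min_word_len';
SHOW VARIABLES LIKE 'ft_stopword_file';

-- Rebuild the FULLTEXT index so it reflects those settings
-- (assumes `auctions` is a MyISAM table).
REPAIR TABLE auctions QUICK;
```

If the phrase search still behaves the same after the rebuild, stopwords and minimum word length can be excluded as the cause.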

UPDATE 2

I would stick to my explanation above, although it is not what you would expect. It looks as if double quotes with a minus sign, as in -"term1 term2", are interpreted as parentheses (), not as double quotes.

This query returns nothing, although I expect to see rows like 2011-12 Fleer Retro auto jordan non and 1999 jordan non auto. It has nothing to do with stopwords.

SELECT auction_id,`name` FROM auctions 
WHERE MATCH(`name`) AGAINST('+jordan +auto -"jordan auto"' IN BOOLEAN MODE);

There is also a related bug, #36384: Full-Text required (+) operator bug. It supports my hypothesis that parsing of full-text search expressions may not work as expected.

Stoleg
  • But I didn't block `auto`; I included the term in double quotes so it should only block the full term of `non auto`. – Brett Jun 17 '13 at 11:11
  • @Brett `non` is either too short, is a stopword or both. – Stoleg Jun 17 '13 at 11:53
  • My `ft_min_word_len` is set to `2` and I have disabled my stop word file so that shouldn't have any effect on the results. – Brett Jun 19 '13 at 08:11
  • @Brett Have you restarted the server and, more importantly, rebuilt your full-text index? When an index is built with the stopword list on, stopwords are not included in the index. So when you disable the stopword list (by specifying an empty string in the `ft_stopword_file` system variable), you need to rebuild the full-text index so that the former stopwords appear in it. – Stoleg Jun 19 '13 at 08:37
0

I hate to say it, but you'll have to use LIKE. Below I've included a query that will work the way you want it to:

SELECT auction_id,`name` FROM auctions
WHERE MATCH(`name`) AGAINST('+jordan' IN BOOLEAN MODE) AND `name` NOT LIKE('%non auto%');

The problem with using full-text mode is that according to MySQL's docs:

Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase" in MySQL 5.0.3, but not before.

This is why you're running into trouble. Hope this helps.

EDIT: As for why it behaves this way exactly (excluding things that contain auto and non regardless of where they are relative to each other) I have no clue, but it doesn't seem like there's much of a way to override this default behavior.

Jamie E
  • Hmmmm...... not really that important to me that I would go to using `LIKE` and would be a pain to do dynamically. – Brett Jun 24 '13 at 14:46
  • It wouldn't be that much of a pain. You just get whatever string from the user, or elsewhere in the code, and surround it with % % in the like condition, and you're set. But I don't know how you're getting the input for a search so maybe it wouldn't work. – Jamie E Jun 24 '13 at 16:50