I have a table of product names, and full-text search works great until the search term contains a period, e.g. searching for a 3.7v battery. I tried:

select .. where match(name) against ('+3.7v +battery' in boolean mode)

I saw the suggestion to enclose the term containing the period in double quotes, i.e. against ('+\"3.7v\"' in boolean mode) or against ('\"3.7v\"' in boolean mode).

Neither returns any results. I looked at the default list of stop words (https://dev.mysql.com/doc/refman/5.1/en/fulltext-stopwords.html) and don't see the period listed.

I looked at http://dev.mysql.com/doc/refman/5.6/en/fulltext-fine-tuning.html and saw that ft_min_word_len is 3, which "3.7v" should satisfy.

Does anybody have any suggestions?

Edit: http://dev.mysql.com/doc/refman/5.6/en/fulltext-natural-language.html says: "The FULLTEXT parser determines where words start and end by looking for certain delimiter characters; for example, “ ” (space), “,” (comma), and “.” (period)."

I guess these delimiter characters are separate from the list of stop words. Is there any way to configure them without modifying the source code? Unfortunately I am using InnoDB, for which full-text functionality was only recently added, and in 5.6 full-text parser plug-ins can only be created for MyISAM tables.
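To make the effect of the delimiter rule concrete, here is a rough Python model of it. This is only a sketch, not MySQL's actual parser: the function name `model_tokens` is hypothetical, the split rule (anything other than letters, digits, and underscore is a delimiter) is a simplification, and the minimum length of 3 is taken from the question. Under those assumptions, the period splits "3.7v" into "3" and "7v", and both pieces are then too short to be indexed, which would plausibly explain the empty results.

```python
import re

# Rough model only: MySQL's real parser is written in C and has more rules
# (for example around apostrophes). MIN_TOKEN_LEN is the minimum word
# length of 3 mentioned in the question.
MIN_TOKEN_LEN = 3


def model_tokens(text, min_len=MIN_TOKEN_LEN):
    """Split on anything that is not a letter, digit, or underscore,
    then drop tokens shorter than min_len."""
    raw = re.split(r"[^0-9A-Za-z_]+", text.lower())
    return [token for token in raw if len(token) >= min_len]


# The period acts as a delimiter, so "3.7v" becomes "3" and "7v",
# and neither piece reaches the minimum length.
print(model_tokens("3.7v battery"))
print(model_tokens("lithium 3.7v battery pack"))
```

In this model only "battery" survives from "3.7v battery", so a required term like +3.7v has nothing in the index to match.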

2 Answers

where match(name) against ('+battery' in boolean mode)
  AND name LIKE '%3.7v%'

This requires some intelligence in your app when constructing the query.
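One way the app-side logic could look, as a minimal Python sketch. The function name `build_search`, the default column name, and the `%s` placeholder style (used by several MySQL drivers for Python) are assumptions for illustration, not part of the original answer. Terms made only of word characters go into the boolean full-text expression; anything else, such as "3.7v", falls back to LIKE.

```python
def build_search(terms, column="name"):
    """Return (where_clause, params) for a parameterized query.

    Terms made only of letters, digits, and underscores go into the boolean
    full-text expression; anything else (such as "3.7v") falls back to LIKE.
    The column name is assumed to be trusted, not user input.
    """
    fulltext, likes = [], []
    for term in terms:
        if term.replace("_", "").isalnum():
            fulltext.append("+" + term)
        else:
            # Escape LIKE wildcards so "%" or "_" in user input match literally.
            escaped = (
                term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
            )
            likes.append("%" + escaped + "%")

    clauses, params = [], []
    if fulltext:
        clauses.append(f"MATCH({column}) AGAINST (%s IN BOOLEAN MODE)")
        params.append(" ".join(fulltext))
    for pattern in likes:
        clauses.append(f"{column} LIKE %s")
        params.append(pattern)
    return " AND ".join(clauses), params


where, params = build_search(["3.7v", "battery"])
print(where)
print(params)
```

Passing the search terms as parameters rather than concatenating them into the SQL string keeps user input out of the query text. The full-text condition still does the heavy filtering, so the LIKE only has to check the rows that already matched.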

Other variants:

This makes sure the 3.7v is next to battery:

where match(name) against ('+battery' in boolean mode)
  AND name LIKE '%3.7v battery%'

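This makes sure it has word boundaries around it (the period is written as [.] so that it matches only a literal period rather than any character):

where match(name) against ('+battery' in boolean mode)
  AND name REGEXP '[[:<:]]3[.]7v[[:>:]]'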
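In a regular expression an unescaped period matches any single character, so it is worth escaping in patterns like this one. A small Python illustration follows, using `\b` as a stand-in for MySQL's `[[:<:]]` and `[[:>:]]` word-boundary markers. That substitution is an assumption made for the demo, since the two regex engines are not identical, and the sample product names are made up.

```python
import re

# \b stands in for MySQL's [[:<:]] / [[:>:]] markers (assumption for the demo).
unescaped = re.compile(r"\b3.7v\b")   # "." matches any character
escaped = re.compile(r"\b3\.7v\b")    # "\." matches only a literal period

# Hypothetical product names, for illustration only.
names = ["3.7v battery", "3x7v adapter", "13.7v battery"]

for name in names:
    print(name, bool(unescaped.search(name)), bool(escaped.search(name)))
```

The unescaped pattern also accepts "3x7v adapter", while the escaped one does not. Both reject "13.7v battery", because there is no word boundary between the "1" and the "3".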
Rick James

As documented under Fine-Tuning MySQL Full-Text Search:

You can change the set of characters that are considered word characters in several ways, as described in the following list. After making the modification, rebuild the indexes for each table that contains any FULLTEXT indexes. Suppose that you want to treat the hyphen character ('-') as a word character. Use one of these methods:

  • Modify the MySQL source: In storage/myisam/ftdefs.h, see the true_word_char() and misc_word_char() macros. Add '-' to one of those macros and recompile MySQL.

  • Modify a character set file: This requires no recompilation. The true_word_char() macro uses a “character type” table to distinguish letters and numbers from other characters. You can edit the contents of the <ctype><map> array in one of the character set XML files to specify that '-' is a “letter.” Then use the given character set for your FULLTEXT indexes. For information about the <ctype><map> array format, see Section 10.3.1, “Character Definition Arrays”.

  • Add a new collation for the character set used by the indexed columns, and alter the columns to use that collation. For general information about adding collations, see Section 10.4, “Adding a Collation to a Character Set”. For an example specific to full-text indexing, see Section 12.9.7, “Adding a Collation for Full-Text Indexing”.

eggyal
    I can't believe I missed that! However, note that the last two "easy" options can only be used with non-utf8 collations, since utf8 uses a built-in charset. So for any time travellers reading this, see the third answer on http://stackoverflow.com/questions/401771/how-can-i-find-non-ascii-characters-in-mysql for determining which charset to use. – user3296206 Feb 11 '14 at 08:34