How efficient is MySQL full-text search for non-English languages?

I am starting a project and chose PostgreSQL because of its strong support for full-text search in multiple languages. In PostgreSQL I can specify the language for a full-text search to get the best results.

Does MySQL have something similar? Is anyone using MySQL full-text search with non-English languages?

Roger that
  • I know there are stopwords that are ignored in full-text search; the default list includes common words such as "as", "the", and "is". You can define stopwords for your native language as described here: https://dev.mysql.com/doc/refman/5.7/en/fulltext-stopwords.html. I'm not sure whether other adjustments are needed, but in my experience it works well with Hebrew. – Alon Eitan Nov 11 '17 at 09:59
  • Thanks for your feedback, Alon. I hope to hear from people using it in other languages as well. – Roger that Nov 11 '17 at 10:25
  • Possible duplicate of [Does MySql full text search works reasonably with non-Latin languages (Hebrew, Arabic, Japanese...)](https://stackoverflow.com/questions/1354142/does-mysql-full-text-search-works-reasonably-with-non-latin-languages-hebrew-a) – Alon Eitan Nov 11 '17 at 11:10
  • I found a similar question. The answers say it generally works with other languages, and one answer also says to check the stopwords for your specific non-English language. – Alon Eitan Nov 11 '17 at 11:11

1 Answer

As of version 8, it does NOT work well with Chinese or Thai.

The built-in MySQL full-text parser uses the whitespace between words as delimiters. Some languages, such as Chinese, Japanese, Korean, Thai, and Khmer, use writing systems that don't commonly put whitespace between individual words.
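
To make the delimiter problem concrete, here is a minimal Python sketch. It is not MySQL's parser: `whitespace_tokens` and `char_bigrams` are hypothetical helper names, and splitting purely on whitespace is a simplification of what a delimiter-based parser does. The bigram function is included only as a contrast, to show one common way of tokenizing text that has no spaces.

```python
def whitespace_tokens(text):
    """Split on runs of whitespace (a simplified delimiter-based tokenizer)."""
    return text.split()


def char_bigrams(text):
    """Return overlapping two-character tokens, ignoring any whitespace."""
    compact = "".join(text.split())
    return [compact[i:i + 2] for i in range(len(compact) - 1)]


english = "full text search"
chinese = "全文搜索引擎"  # "full-text search engine", written without spaces

print(whitespace_tokens(english))  # three separate words
print(whitespace_tokens(chinese))  # the whole phrase comes back as one token
print(char_bigrams(chinese))       # overlapping two-character tokens
```

With whitespace splitting, the Chinese phrase is indexed as a single long token, so a search for a word inside it (for example 搜索, "search") has nothing to match against.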

MySQL has a MeCab full-text parser plugin, which targets Japanese, to address the problem, but I haven't tried it.

emanresu