1

i use from "FullText" in "MySql" Because there are non-related words (unknown) in the text

table name: find

records:

id      |     name
----------------------------
1             help me
2             helply goods
3             this help good every
4             this easy helps goods
5             this help good

i try this code:

SELECT * FROM `Find`
WHERE MATCH(`name`) AGAINST('internet* help* this* good* for* alll*' IN BOOLEAN MODE)

"IN BOOLEAN MODE" show all records that have one of these words.

I want search method like the "IN NATURAL LANGUAGE MODE" to show the best result. But in this way, the correct word should be written that is not good

So I have to use the IN BOOLEAN MODE method

output of this query is :

id      |     name
----------------------------
2             helply goods
3             this help good every
4             this easy helps goods
5             this help good

but i want run a query that sorted by "best results" example find : internet* help* this* good* for* alll*

output:

id      |     name
----------------------------
5     |     this help good
3     |     this help good every
4     |     this easy helps goods
2     |     helply goods
1     |     help me
AmirHadi
  • 53
  • 8
  • You have to explicitly add the order: `order by MATCH(`name`) AGAINST('internet* help* this* good* for* alll*' IN BOOLEAN MODE)`. The natural language search does it implicitly. – Solarflare Aug 27 '17 at 07:29
  • I've already tried it, but it's not showing up right now Field 1 (help me) is not displayed, for example – AmirHadi Aug 27 '17 at 07:39
  • Well, that is a different question than the order. `help` is a [stopword](https://dev.mysql.com/doc/refman/5.7/en/fulltext-stopwords.html) in `myisam`, so it will not be included in the index (so you cannot find it). `me` is probably shorter than your [minimal word length](https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_ft_min_word_len), so it will not be included in the index. – Solarflare Aug 27 '17 at 07:58
  • I set the value of the field "ft_min_word_len" to 1, but again the results are not like "IN NATURAL LANGUAGE MODE" mode. – AmirHadi Aug 27 '17 at 08:02
  • You are mixing topics. The natural and the boolean search will use the same index, so if words like "help" or "me" are not in it, you can't find it with neither mode. So I'm not really sure what you mean by "again the results are not like IN NATURAL LANGUAGE MODE"). If you want the same result, use the same mode. One detail: the myisam natural mode has a 50% threshold (read the last part in the [manual](https://dev.mysql.com/doc/refman/5.7/en/fulltext-natural-language.html), "For very small tables," ). Could you check IF you are using MyISAM? (as most of your words are in the stopword list). – Solarflare Aug 27 '17 at 08:17
  • In the "IN NATURAL LANGUAGE MODE" method, the * character is not conceptual. Is there a way to use * in this method? – AmirHadi Aug 27 '17 at 08:48
  • No, it is not. You have to use the boolean mode for this. I just do not understand what you are looking for. The results will be different in two different modes. Especially the order, which is what you asked for (I forgot the `order by ... desc` to make it place the best result on top, maybe that's what is bothering you). It will not be the exact same order or the exact same rows. Also, PLEASE check if you are using MyISAM. Literally every word you use in you search is a MyISAM stopword, so that example would not be usable to investigate the fulltext behaviour in MyISAM. – Solarflare Aug 27 '17 at 09:11
  • The texts that are to be searched are not specific and users enter it, so the results should be based on the user's search. And maybe search for non-related text that does not exist in the database, and the closest results to the user should not be given. Yes I use MyISAM – AmirHadi Aug 27 '17 at 09:35
  • Please check the link I posted about stopwords. `this`, `help` and `every` are in it, so they are not included in the index. They will have no effect on your search results (that is e.g. why `help me` is not found, as `help` cannot be found and you didn't search for `me`). So if you try to investigate how the fulltext search works, these are not good examples. Either disable the list (`set ft_stopword_file = '';` and rebuild the index) or use other words, otherwise you will have a hard time understanding the result (and I will have a hard time understanding what your problem is). – Solarflare Aug 27 '17 at 10:39
  • Yes, in this example you are correct, but if you look at the example, if we test other words that are not in the "stopwords" list, then the results are not based on the best result. – AmirHadi Aug 27 '17 at 13:43
  • I can't see anything in the example. It's obviously not ordered, because you didn't order it. So add a result *with* `order by MATCH(name) AGAINST('internet* help* this* good* for* alll*' IN BOOLEAN MODE) desc`. It will order the result by the number of hits (which differs from the natural mode that included the length of the sentence in the formula). It will ignore `this`, `help` and `every` for this. That's how the boolean mode fulltext search works. You may want to try innodb, it orders by a different algorithm. If you don't like either, you can always calculate a weight/order on your own. – Solarflare Aug 27 '17 at 15:04
  • how to calculate a weight/order? – AmirHadi Aug 27 '17 at 15:14
  • Just write a formula for a weight and order by it. E.g. create a function myweight and then use `order by myweight('internet* help* this* good* for* alll*', name) desc`. In that function you can do what you want (e.g. count words, give more points if `internet` is in it than if `internetservice` is in it, ...). It will totally depend on what you think the perfect order should look like (and if you can express this in a formula). You could also do something like `order by match (...) against (...) desc, length(name)` to prefer shorter text. As I said, it depends on what you consider "best". – Solarflare Aug 27 '17 at 15:29
  • Thank you very much for the description, please explain the function of this function and give an example – AmirHadi Aug 27 '17 at 20:10
  • Stackoverflow is not a codewriting service, so please try to write it yourself (look e.g. [here](https://stackoverflow.com/q/6496866/6248528) or [here](https://stackoverflow.com/q/5322917/6248528) or google.com for inspiration), post what you did and where you get intro trouble. Such a function will be annoying to write though, I would not recommend it; you have to do a lot of string processing, and it's hard to get better results than now or with innodb or with a simple division by textlength. You might also want to check out other search engines like Solr/elasticsearch/lucene instead. – Solarflare Aug 27 '17 at 20:41
  • For example, in the database, the word "ba to" is saved, I want to find the record if the "bato" is searched – AmirHadi Aug 28 '17 at 09:35
  • That's a different question (so you should post a new one), but anyway: you can do this by splitting the input string into every combination, e.g. `against("b ato" "ba to" "bat o")`. I'd suggest you make a list of requirements, e.g. what input should produce which output (bato finds ba to, but not batom), text type (russian literature, product codes), the desired order, rowcount,... Then do some research (e.g. read the fulltext manual, read about solr/lucene/elasticsearch, test some things). Then post this on softwarerecs.stackexchange.com ("what should i use" does not fit on stackoverflow). – Solarflare Aug 28 '17 at 11:08
  • Sorry, the word splitting method is good, but the search result is not exact as usual Is there a way to weight the letters? Because the value of some words is equal to me, for example, the letters "gh" and the letter "q": and the word "eshgh" is equal to the word "eshq" in my search. Unfortunately, since my question is stereotyped, I can not post in a post – AmirHadi Aug 28 '17 at 14:03
  • If you want to weight in a different way than the fulltext search does it, you will need to write your own function. If you want to treat gh and q equal, you can replace gh by q in your text and any search input, or for an input of gh, search for both gh and q. But this is a bottomless pit. For every idea you find two new requirements you want to be treated in a special way. So again: write a list with all your requirements and ask for a software recommendation that can do it on the partnersite that is, in contrast to stackoverflow, there to recommend software: softwarerecs.stackexchange.com. – Solarflare Aug 28 '17 at 15:34

0 Answers0