6

So I have been reading the below:

How can I manipulate MySQL fulltext search relevance to make one field more 'valuable' than another?

I was interested in the below response and tried to implement this with success.

SELECT url, keywords, title,
  MATCH (keywords) AGAINST ('green watermelon') AS rel1,
  MATCH (title) AGAINST ('green watermelon') AS rel2
FROM straight
WHERE MATCH (keywords,title) AGAINST ('green watermelon')
ORDER BY (rel1)+(rel2*1.5)

My only problem is the following.

If a user is to search for say "Green watermelon" which matches 3 rows in a table with 2 columns as below:

+------------+------------+
| Keyword    | Title      |
+------------+------------+
| Green      | Green      |
| Green      | Watermelon |
| Watermelon | Watermelon |
+------------+------------+

I would like the second record to be "ranking" first as it is most relevant, but they all have the same relevancy for the search term "Green Watermelon" due to "Green" being used twice within Keyword and Title.

The best way I think this can be solved is if it finds the word green once it increases the relevancy by say 1 and any other times it sees the word green it increases it by 0 then if it sees watermelon it does the same. This would give the second result a relevancy score of 2 and the other 2 rows a relevancy of 1. Therefore making it more relevant as it should be. This will then work for longer search terms when searching in a larger database.

EDIT:

If possible I could create another column that has both the title and the keywords in, so say the column is called "mashup" the table will now look more like:

+------------+------------+-----------------------+
| Keyword    | Title      | Mashup                |
+------------+------------+-----------------------+
| Green      | Green      | Green Green           |
| Green      | Watermelon | Green Watermelon      |
| Watermelon | Watermelon | Watermelon Watermelon |
+------------+------------+-----------------------+

Then "if possible" we would need to de dupe the fields so they end up like:

+------------+------------+-----------------------+
| Keyword    | Title      | Mashup                |
+------------+------------+-----------------------+
| Green      | Green      | Green                 |
| Green      | Watermelon | Green Watermelon      |
| Watermelon | Watermelon | Watermelon            |
+------------+------------+-----------------------+

Then the search will work as intended with perfect relevancy scores.

Thanks in advance

Community
  • 1
  • 1
BubblewrapBeast
  • 1,507
  • 2
  • 15
  • 19
  • 1
    Can you setup a sqlfiddle? – Mihai Jan 14 '15 at 14:42
  • @Mihai Sorry I have never set one up before, will this enable you to have a look is that what you are after? – BubblewrapBeast Jan 14 '15 at 16:24
  • 1
    yeah,I want to play with it a bit,if I knew exactly what was what I would have given an answer.Just a table with a few rows. – Mihai Jan 14 '15 at 16:34
  • @Mihai for some reason when I select build schema it just freezes. If you try and create a table with just two columns and imput 3 rows like I did above, then full text the keywords column then the title column then both columns you will see what I mean. Thanks – BubblewrapBeast Jan 14 '15 at 17:05
  • 1
    I\`ll do it on my machine because I`m also curious but no promises – Mihai Jan 14 '15 at 17:09
  • Because the problem is, if I have a larger database and I am looking specifically for "Green Watermelon" and a field has Green in both or even just in title, it has the same relevancy score to a row that has Green watermelon in title or even green in title and watermelon in keywords. So it makes it slightly pointless even having a relevancy score. – BubblewrapBeast Jan 14 '15 at 17:33
  • Sadly mysql has no build in features for this, the fulltext search is very limited. So you can either create plain data as you already described or split the search - for example match each word for its own an then calculate the total score as you want out of those single scores. Or you could use a external search engine like "lucene" but that might be overkill if this is your only fulltext problem. – Tarsis Feb 04 '15 at 10:35
  • If it is enough to rank a full hit on top you could also use a second Match with booleanmode and "+" option (MATCH (+search1 +search2) AGAINST.. and add its score to the original score. – Tarsis Feb 04 '15 at 10:43

1 Answers1

0

You only need to specify the more important column in the first position of your MATCH()

 SELECT url, keywords, title
      FROM straight
      WHERE MATCH(title, keywords) AGAINST('green watermelon')
      ORDER BY MATCH(title, keywords) AGAINST('green watermelon') DESC;

+------+------------+------------+
| id   | keywords   | title      |
+------+------------+------------+
| 2    | Green      | Watermelon |
| 3    | Watermelon | Green      |
| 1    | Green      | Green      |
+------+------------+------------+
Adam
  • 17,838
  • 32
  • 54