I have a table which has two columns: ID (primary key, auto increment) and keyword (text, full-text index).
The values entered in the keyword column include the following:
keyword
- Car
- Car sales
- Cars
- Sports cars
- Sports foo
- Car bar
- Statistics
Suppose that we have this sentence as an input:
"Find sports car sales statistics in Manhattan."
I have been searching for quite a while for either a MySQL query or an algorithm that takes the given input and detects which values from the keyword column appear in it, resulting in an output of:
"Sports cars", "Car sales", "Statistics"
In other words, I'm trying to take an input that's in the form of a sentence, and then match all the existing (and most relevant) keyword values in the database that are found in the sentence. Note that these keywords could be phrases that consist of words separated by a space.
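To make the desired behaviour concrete, here is a minimal Python sketch of the matching I have in mind. It is not a MySQL solution and not necessarily the right approach, only an illustration. The helper names (normalize, tokens, match_keywords), the crude plural stripping, and the rule that drops a keyword when its matched words lie entirely inside a longer match are all my own assumptions about what "most relevant" should mean.

```python
import re

# Sample values from the keyword column.
KEYWORDS = ["Car", "Car sales", "Cars", "Sports cars",
            "Sports foo", "Car bar", "Statistics"]


def normalize(word):
    """Lowercase and strip one trailing 's' (crude plural handling, an assumption)."""
    word = word.lower()
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def tokens(text):
    """Split text into normalized alphanumeric words."""
    return [normalize(w) for w in re.findall(r"[A-Za-z0-9]+", text)]


def match_keywords(sentence, keywords):
    """Return keywords whose words occur contiguously in the sentence,
    dropping any match that lies entirely inside a longer match."""
    words = tokens(sentence)
    hits = []  # (start, end, keyword) spans over the sentence's words
    for kw in keywords:
        kw_words = tokens(kw)
        if not kw_words:
            continue
        n = len(kw_words)
        for i in range(len(words) - n + 1):
            if words[i:i + n] == kw_words:
                hits.append((i, i + n, kw))

    kept = []
    for start, end, kw in hits:
        # A hit is covered when a strictly longer hit spans the same words.
        covered = any(s <= start and end <= e and (e - s) > (end - start)
                      for s, e, _ in hits)
        if not covered:
            kept.append((start, kw))

    # Order by position in the sentence and remove duplicates.
    return list(dict.fromkeys(kw for _, kw in sorted(kept)))


if __name__ == "__main__":
    print(match_keywords("Find sports car sales statistics in Manhattan.", KEYWORDS))
```

With the sample data, "Car" and "Cars" are dropped because the word they match is already covered by "Sports cars" and "Car sales", while "Sports foo" and "Car bar" never match because foo and bar are not in the input.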
After some research, I learned that MySQL does a similar job through its full-text search feature. I have tried the natural language, boolean, and query expansion modes, but they all include keyword records in which only some of the words match the input. For example, the output is:
"Car", "Car sales", "Sports cars", "Sports foo", "Car bar", "Statistics".
I don't want this to happen because it includes words that aren't even in the input (i.e. foo and bar).
Here's the MySQL query for the above-mentioned search:
SELECT * FROM tags WHERE MATCH(keyword) AGAINST('Find sports car sales statistics in Manhattan.' IN BOOLEAN MODE)
I also tried to improve on the relevancy, but this one only returns a single record:
SELECT *,
       SUM(MATCH(keyword) AGAINST('Find sports car sales statistics in Manhattan.' IN BOOLEAN MODE)) AS score
FROM tags
WHERE MATCH(keyword) AGAINST('Find sports car sales statistics in Manhattan.' IN BOOLEAN MODE)
ORDER BY score DESC