I'm writing some very simple search functionality for a list of FAQ's. I'm splitting the search string on a variety of characters, including spaces. Then performing a select along the lines of
SELECT *
FROM "faq"
WHERE
((LOWER("Question") LIKE '%what%'
OR LOWER("Question") LIKE '%is%'
OR LOWER("Question") LIKE '%a%'
OR LOWER("Question") LIKE '%duck%'))
I've had to edit this slightly as its generated by our data access layer but it should give you an idea of whats going on.
The problem is demonstrated well with the above query in that most questions are likely to have the words a or is in them, however I can not filter these out as acronyms may well be important for the searcher. What has been suggested is we order by the number of matching keywords. However I have been unable to find a way of doing this in SQL (we do not have time to create a simple search engine with an index of keywords etc). Does anyone know if there's a way of counting the number of LIKE matches in an SQL statement and ordering by that so that the questions with the most keywords appear at the top of the results?