1

I have a search query that is able to sort results by relevance according to how many of the words from the query actually show up.

SELECT id, 
       thesis 
FROM   activity p 
WHERE  p.discriminator = 'opinion' 
       AND ( thesis LIKE '%gun%' 
              OR thesis LIKE '%crucial%' ) 
ORDER  BY ( ( CASE 
                WHEN thesis LIKE '%gun%' THEN 1 
                ELSE 0 
              end ) 
          + ( CASE 
                WHEN thesis LIKE '%crucial%' THEN 1 
                ELSE 0 
              end ) ) 
           DESC 

This query however, does not sort according to how many times 'gun' or 'crucial' show up. I want to make it so records with more occurrences of 'gun' show up above records with less occurrences. (I.E, add a point for every time gun shows up rather than adding a point because gun shows up at least once)

Shiplu Mokaddim
  • 56,364
  • 17
  • 141
  • 187
NightRaven
  • 401
  • 3
  • 17

2 Answers2

2

I might be wrong but without use of stored procedures or UDF You won't be able to count string occurrences. Here's sample stored function that counts substrings:

drop function if exists str_count;

delimiter |
create function str_count(sub varchar(255), str varchar(255)) RETURNS INTEGER
DETERMINISTIC NO SQL
BEGIN
    DECLARE count INT;
    DECLARE cur INT;

    SET count = 0;
    SET cur = 0;

    REPEAT
        SET cur = LOCATE(sub, str, cur+1);
        SET count = count + (cur > 0);
    UNTIL (cur = 0)
    END REPEAT;

    RETURN(count);
END|

You might want to change varchar(255) to varchar(65536) or TEXT. You can now use it in order by query:

SELECT id, 
       thesis 
FROM   activity p 
WHERE  p.discriminator = 'opinion' 
       AND ( thesis LIKE '%gun%' 
              OR thesis LIKE '%crucial%' ) 
ORDER  BY STR_COUNT('gun',thesis) + STR_COUNT('crucial', thesis)

If Your dataset is large and performance is important for You I suggest to write custom UDF in C.

matt
  • 4,614
  • 1
  • 29
  • 32
1

Depending on how your database is set up, you may find MySQL's full text indexing to be a better fit for your use case. It allows you to index fields and search for words in them, ordering the results by relevance related to the number of occurrences.

See the documentation here: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

This is a useful question that gives some examples, and may help: How can I manipulate MySQL fulltext search relevance to make one field more 'valuable' than another?

Finally, if full text searches aren't an option for you, the comment posted by Andrew Hanna on the string functions reference may do the trick: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html (search the page for "Andrew Hanna"). They create a function on the server which can count the number of times a string occurs.

Hope this helps.

Community
  • 1
  • 1
cgwyllie
  • 413
  • 3
  • 10