2

I have multiple conditions in a query like this:

SELECT * FROM image WHERE name LIKE '%text%' AND group_id = 10 LIMIT 1

The WHERE statements consist of 3 conditions:

  • Text match
  • Match of a foreign key

What if I want to sort the result by relevance, so depending on:

  1. How exact the text is matched
  2. How much conditions are met at all (e.g. the text match and the foreign key)

This is two questions in one but I think some times these will be in handy in combination. By this I'm referring to a question arising from a former post of mine (Way to try multiple SELECTs till a result is available?).

Thanks in advance!

Community
  • 1
  • 1
nepa
  • 1,421
  • 2
  • 18
  • 28

1 Answers1

7

To know how exactly the text is matched, you need the fuzzystrmatch PostgreSQL module. It provides functions as difference and levenshtein that will give you an estimation of the similarity between two strings.

Then, you can build your query with some conditions like:

    SELECT *
      FROM image
     WHERE name LIKE '%text%' 
       AND ( group_id = 10 
             OR second_field = 4
             OR thirth_field = 5
           )
    ORDER BY ( (group_id=10)::int
               + (second_field=4)::int
               + (thirth_field=5)::int
             ) * weight_1
             + (strlen(name) - levenshtein('text',name)) * weight_2

You can adjust weight_1 and weight_2 to give preference to the text distance or the number of conditions met.

Christian
  • 27,509
  • 17
  • 111
  • 155
Oscar Pérez
  • 4,377
  • 1
  • 17
  • 36
  • 3
    +1 But to avoid what I consider the `case` verbosity I would sum the booleans itseft like `(group_id=10)::int + ...` – Clodoaldo Neto Feb 05 '13 at 13:59
  • True, it will be more concise with this kind of expression – Oscar Pérez Feb 05 '13 at 14:18
  • Exactly what I needed, thanks @oscar-perez! Maybe it would be good to edit your answer so it also includes the suggestion of @Clodoaldo – nepa Feb 05 '13 at 17:39
  • Is it also possible to add the calculated relevance to the resultset in an easy way (without repeating the expression)? – nepa Feb 05 '13 at 17:41
  • 1
    @nepa: I'm not sure, but maybe you could move the relevance calculation to just after the * (something like SELECT (...), IMAGE.* FROM...) and use a GROUP BY 1 – Oscar Pérez Feb 05 '13 at 19:25