0

My earlier question was resolved. Now I need to develop a related, but more complex query.

I have a table like this:

id     description          additional_info
-------------------------------------------
123    games                XYD
124    Festivals sport      swim

And I need to count matches to arrays like this:

array_content varchar[] := {"Festivals,games","sport,swim"}

If either of the columns description and additional_info contains any of the tags separated by a comma, we count that as 1. So each array element (consisting of multiple words) can only contribute 1 to the total count.

The result for the above example should be:

id    RID    Matches
1     123    1
2     124    2
Community
  • 1
  • 1
Himanshu Mohan
  • 722
  • 9
  • 30
  • I have read it twice but did not understand what you need exactly. Please clarify. – Erwin Brandstetter Nov 02 '16 at 16:09
  • @erwin : I need to match the description and name with the array elements. Over here if any of the tags(separated by comma) present on each array position matches the desc. or title then we consider it as matched = 1. The query which we were executing earlier had only one tag on each array position which was easily separated by unnset. – Himanshu Mohan Nov 02 '16 at 16:22
  • I took the liberty to rewrite your question to clarify. – Erwin Brandstetter Nov 03 '16 at 03:24

1 Answers1

1

The answer isn't simple, but figuring out what you are asking was harder:

SELECT row_number() OVER (ORDER BY t.id) AS id
     , t.id AS "RID"
     , count(DISTINCT a.ord) AS "Matches"
FROM   tbl t
LEFT   JOIN (
   unnest(array_content) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id;

Produces your desired result exactly.
array_content is your array of search terms.

How does this work?

Each array element of the outer array in your search term is a comma-separated list. Decompose the odd construct by unnesting twice (after transforming each element of the outer array into another array). Example:

SELECT *
FROM   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
CROSS  JOIN LATERAL
       unnest(string_to_array(elem, ',')) txt;

Result:

 elem            | ord |  txt
-----------------+-----+------------
 Festivals,games | 1   | Festivals
 Festivals,games | 1   | games
 sport,swim      | 2   | sport
 sport,swim      | 2   | swim

Since you want to count matches for each outer array element once, we generate a unique number on the fly with WITH ORDINALITY. Details:

Now we can LEFT JOIN to this derived table on the condition of a desired match:

   ... ON t.description ~ a.txt
       OR t.additional_info ~ a.txt

.. and get the count with count(DISTINCT a.ord), counting each array only once even if multiple search terms match.

Finally, I added the mysterious id in your result with row_number() OVER (ORDER BY t.id) AS id - assuming it's supposed to be a serial number. Voilá.

The same considerations for regular expression matches (~) as in your previous question apply:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228