Your arrays contain only primitive values, nested documents would be more complicated.
Query
Unnest the JSON arrays of found rows with jsonb_array_elements_text()
in a LATERAL
join and count matches:
SELECT *
FROM (
SELECT *
FROM tbl
WHERE data->'tags' ?| ARRAY['foo', 'bar']
) t
, LATERAL (
SELECT count(*) AS ct
FROM jsonb_array_elements_text(t.data->'tags') a(elem)
WHERE elem = ANY (ARRAY['foo', 'bar']) -- same array parameter
) ct
ORDER BY ct.ct DESC; -- more expressions to break ties?
Alternative with INSTERSECT
. It's one of the rare occasions that we can make use of this basic SQL feature:
SELECT *
FROM (
SELECT *
FROM tbl
WHERE data->'tags' ?| '{foo, bar}'::text[] -- alt. syntax w. array
) t
, LATERAL (
SELECT count(*) AS ct
FROM (
SELECT * FROM jsonb_array_elements_text(t.data->'tags')
INTERSECT ALL
SELECT * FROM unnest('{foo, bar}'::text[]) -- same array literal
) i
) ct
ORDER BY ct.ct DESC;
Note a subtle difference: This consumes each element when matched, so it does not count unmatched duplicates in data->'tags'
like the first variant does. For details see the demo below.
Also demonstrating an alternative way to pass the array parameter: as array literal: '{foo, bar}'
. This may be simpler to handle for some clients:
Or you could create a server side search function taking a VARIADIC
parameter and pass a variable number of plain text
values:
Related:
Index
Be sure to have a functional GIN index to support the jsonb
existence operator ?|
:
CREATE INDEX tbl_dat_gin ON tbl USING gin (data->'tags');
Nuances with duplicates
Clarification as per request in the comment. Say, we have a JSON array with two duplicated tags (4 total):
jsonb '{"tags": ["foo", "bar", "foo", "bar"]}'
And search with an SQL array parameter including both tags, one of them duplicated (3 total):
'{foo, bar, foo}'::text[]
Consider the results of this demo:
SELECT *
FROM (SELECT jsonb '{"tags":["foo", "bar", "foo", "bar"]}') t(data)
, LATERAL (
SELECT count(*) AS ct
FROM jsonb_array_elements_text(t.data->'tags') e
WHERE e = ANY ('{foo, bar, foo}'::text[])
) ct
, LATERAL (
SELECT count(*) AS ct_intsct_all
FROM (
SELECT * FROM jsonb_array_elements_text(t.data->'tags')
INTERSECT ALL
SELECT * FROM unnest('{foo, bar, foo}'::text[])
) i
) ct_intsct_all
, LATERAL (
SELECT count(DISTINCT e) AS ct_dist
FROM jsonb_array_elements_text(t.data->'tags') e
WHERE e = ANY ('{foo, bar, foo}'::text[])
) ct_dist
, LATERAL (
SELECT count(*) AS ct_intsct
FROM (
SELECT * FROM jsonb_array_elements_text(t.data->'tags')
INTERSECT
SELECT * FROM unnest('{foo, bar, foo}'::text[])
) i
) ct_intsct;
Result:
data | ct | ct_intsct_all | ct_dist | ct_intsct
-----------------------------------------+----+---------------+---------+----------
'{"tags": ["foo", "bar", "foo", "bar"]}' | 4 | 3 | 2 | 2
Comparing elements in the JSON array to elements in the array parameter:
- 4 tags match any of the search elements:
ct
.
- 3 tags in the set intersect (can be matched element-to-element):
ct_intsct_all
.
- 2 distinct matching tags can be identified:
ct_dist
or ct_intsct
.
If you don't have dupes or if you don't care to exclude them, use one of the first two techniques. The other two are a bit slower (besides the different result), because they have to check for dupes.