0

I have a schema like the following (just an example):

CREATE TABLE example (
  id    int primary key
, text1 text
, text2 text
, text3 text
, text4 text
, text5 text
, text6 text
);

Question:

I'm trying to run a pattern matching search on the 6 text columns using the ~* operator. Can I query the example table by the count of appearances of my pattern in the 6 columns?

Example:

I want to find all rows that have said keyword in at least 3 of the 6 text columns. Whichever of those 6 columns does not matter.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
user2737876
  • 1,038
  • 2
  • 12
  • 20

2 Answers2

2

You can find a number of matches in the simple way like this:

insert into example values (1, 'a', 'a', 'a', 'b', 'c', 'd');

select 
    id, 
    (text1 ~* 'a')::int+ (text2 ~* 'a')::int+ 
    (text3 ~* 'a')::int+ (text4 ~* 'a')::int+ 
    (text5 ~* 'a')::int+ (text6 ~* 'a')::int as matches
from example;

 id | matches 
----+---------
  1 |       3

If performance is not a key issue in the case, you can use maybe a bit more convenient but slower query:

select id, sum((t ~* 'a')::int) as matches
from example,
lateral unnest(array[text1, text2, text3, text4, text5, text6]) as t
group by 1;

You can use the expressions in WHERE (query #1) or HAVING (query #2) clauses:

select *
from example
where
    (text1 ~* 'a')::int+ (text2 ~* 'a')::int+ 
    (text3 ~* 'a')::int+ (text4 ~* 'a')::int+ 
    (text5 ~* 'a')::int+ (text6 ~* 'a')::int > 3;

select e.*
from example e,
lateral unnest(array[text1, text2, text3, text4, text5, text6]) as t
group by e.id
having sum((t ~* 'a')::int) > 3;
klin
  • 112,967
  • 15
  • 204
  • 232
  • Hi, thanks for you answer. I don't think my question outlined this but im trying to select by the minimum count. So in examples you have provided im not able to do that. Im looking for a query where I can say "where matches > 3" or something like that. Any way to modify your query to reflect this? I attempted but a little stumped. – user2737876 Dec 12 '16 at 02:59
0

Like klin posted, but this should be simpler and faster:

SELECT e.*
FROM   example e
JOIN   LATERAL (
   SELECT count(*) FILTER (WHERE x.t ~* 'keyword') AS ct
   FROM  (VALUES(text1), (text2), (text3), (text4), (text5), (text6)) x(t)
   ) sub ON sub.ct >= 3;  -- your minimum number of columns here

About VALUES in a LATERAL join:

About aggregate FILTER:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228