2

I'm trying to aggregate data in a concise way from multiple tables. I'm not very experienced with Postgres or SQL, however, so I was hoping I could get some help from here!

The situation is as follows: I have three columns containing text, from two different tables. Let's call them table A and B. I want to search A.text1, A.text2 and B.text3 for the wildcard '%postgres%'. I want the result to return all the instances containing that wildcard to me.

What I would like is a result that looks something like this:

Type Text
text1 ... postgres ...
text2 ... postgres ...
text3 ... postgres ...

My question is: is this feasible, and if it is feasible, can you point me in the right direction to achieve such a result?

The current solution I have created for this is:

select a.text1, a.text2, b.text3
from a
inner join b
on a.id = b.aId
where a.text1 like '%postgres%'
or a.text2 like '%postgres%'
or b.text3 like '%postgres%'

This works, but the result is kind of ugly. There are a lot of cells that contain '[null]' and I don't really like that. It looks like this:

a.text1 a.text2 b.text3
null ... postgres ... null
... postgres ... ... postgres ... null
null null ... postgres ...

Update: I have solved the problem! The sql I ended up with is:

select * from
(select 'text1' as type, text1 as text from a
union all
select 'text2' as type, text2 as text from a
union all
select 'text3' as type, text3 as text from b) as result
where result.text like '%postgres%'
Timo
  • 19
  • 6
  • 2
    Perhaps UNION ALL should be considered? – jarlh May 11 '23 at 11:10
  • 2
    @jarlh Thank you for the comment. Using unions, I have solved the problem! – Timo May 11 '23 at 12:09
  • Well-formulated question and you even went through lengths of styling the tables! Lovely work and welcome! – ak.leimrey May 11 '23 at 12:19
  • 1
    I agree, well done for a first question! Please remember to always disclose your version of Postgres (`SELECT version();` helps.) And if performance matters, consider https://stackoverflow.com/tags/postgresql-performance/info – Erwin Brandstetter May 11 '23 at 23:59

1 Answers1

1

Your query with UNION ALL basically answers the question. I would suggest some optimization:

SELECT t.*
FROM   a
JOIN   LATERAL (
   VALUES
     ('text1', text1)
   , ('text2', text2)
   ) t (type, "text") ON "text" ~ 'postgres'

UNION ALL
SELECT 'text3', text3
FROM   b
WHERE  b.text3 ~ 'postgres';

Table a is only scanned once instead of twice this way, which is about half the cost. And all rows are filtered early - also cheaper.

About LATERAL ( ... VALUES ...):

"text" ~ 'postgres' is equivalent to "text" LIKE '%postgres%'. Only special characters in the search string might make a difference. See:

If tables are big, trigram indexes on the text columns help a lot. See:

I double-quoted the identifier "text" to avoid collisions with the basic type name text. Rather avoid basic type names as identifier altogether.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Very informative, thanks for the extensive answer! A lateral join on the two fields is a great optimization, and I'll make sure to learn what I can from the links you provided. – Timo May 13 '23 at 06:03