Efficient for/each loop to match phrases?

Question

I want to use a for/each loop to search for the names stored in one table (table1) within the text fields of the records in another table (table2), using regular expressions. For a single name, the query looks like this:

SELECT id FROM "table2"
where tags ~* 'south\s?\*?africa'
   or description ~* 'south\s?\*?africa'
order by id asc;

But I do not know how to put this into a for/each loop.

table1:

 t1ID | NAME
 1    | Shiraz      
 2    | south africa
 3    | Limmatplatz

table2:

 t2ID | TAGS                    | DESCRIPTIONS
 101  | shiraz;Zurich;river     | It is too hot in Shiraz and Limmatplatz
 201  | southafrica;limmatplatz | we went for swimming

I have a list of names in table1. The other table (table2) has text fields that might contain those names. I would like to get the IDs of the table2 rows that contain names from table1, together with the IDs of the matching names.

For example:

 t2id | t1id
 101  | 1
 101  | 3
 201  | 2
 201  | 3

My tables have 60,000 and 550,000 rows, so I need an approach that is efficient in terms of run time.

score 1 · Accepted Answer · edited May 23 '17 at 12:16

You don't need a loop. A simple join works.

SELECT t2.id AS t2id, t1.id AS t1id
FROM   table1 t1
JOIN   table2 t2 ON t2.tags        ~* replace(t1.name, ' ', '\s?\*?')
                 OR t2.description ~* replace(t1.name, ' ', '\s?\*?')
ORDER  BY t2.id;
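
To try the join on the sample data from the question, a minimal setup could look like the sketch below. The column names (id, name, tags, description) are assumptions taken from the query above, since the question labels them t1ID, NAME, t2ID, TAGS and DESCRIPTIONS. The pattern literal also assumes standard_conforming_strings is on (the default in current Postgres versions), so the backslashes are passed to the regular expression unchanged.

```sql
-- Sample tables; column names are assumed, not taken from the real schema.
CREATE TABLE table1 (id int PRIMARY KEY, name text NOT NULL);
CREATE TABLE table2 (id int PRIMARY KEY, tags text, description text);

INSERT INTO table1 (id, name) VALUES
   (1, 'Shiraz')
 , (2, 'south africa')
 , (3, 'Limmatplatz');

INSERT INTO table2 (id, tags, description) VALUES
   (101, 'shiraz;Zurich;river'    , 'It is too hot in Shiraz and Limmatplatz')
 , (201, 'southafrica;limmatplatz', 'we went for swimming');

-- Every space in a name becomes "optional whitespace, optional literal *",
-- so 'south africa' is turned into the pattern 'south\s?\*?africa'.
SELECT t2.id AS t2id, t1.id AS t1id
FROM   table1 t1
JOIN   table2 t2 ON t2.tags        ~* replace(t1.name, ' ', '\s?\*?')
                 OR t2.description ~* replace(t1.name, ' ', '\s?\*?')
ORDER  BY t2.id, t1.id;  -- t1.id added only to make the row order deterministic
```

On this sample data the query should return the four pairs listed in the question. Note that names containing regular expression metacharacters (such as a dot or parentheses) would be interpreted as pattern syntax, so they may need escaping first.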

But performance will be terrible for big tables.
There are several things you can do to improve it:

- Normalize table2.tags into a separate 1:n table. Or an n:m relationship to a tag table if tags are used repeatedly (typical case). Details:
  - How to implement a many-to-many relationship in PostgreSQL?
- Use trigram or text search indexes.
  - PostgreSQL LIKE query performance variations
- Use a LATERAL join to actually use those indexes.
  - LATERAL JOIN not using trigram index
- Ideally, use the new capability in Postgres 9.6 to search for phrases with full text search. The release notes:

  Full-text search can now search for phrases (multiple adjacent words)
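
The suggestions above could be sketched roughly as follows. This is an illustration under assumptions, not a tuned solution: the column names (id, name, tags, description) follow the join query, the table name table2_tag and all index names are hypothetical, and the 'simple' text search configuration is just one possible choice. Whether the planner actually uses the indexes depends on the Postgres version, statistics and data, so it is worth checking with EXPLAIN.

```sql
-- 1. Normalize the semicolon-separated tags into a 1:n table
--    (table2_tag is a hypothetical name).
CREATE TABLE table2_tag AS
SELECT t2.id AS t2id, lower(tag) AS tag
FROM   table2 t2
CROSS  JOIN LATERAL unnest(string_to_array(t2.tags, ';')) AS tag;

-- 2. Trigram indexes, which can support LIKE / ILIKE and regular expression matches.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX table2_tags_trgm_idx        ON table2 USING gin (tags gin_trgm_ops);
CREATE INDEX table2_description_trgm_idx ON table2 USING gin (description gin_trgm_ops);

-- 3. LATERAL join: one lookup in table2 per name in table1.
--    UNION removes duplicates when a name matches both columns of the same row.
SELECT t2.id AS t2id, t1.id AS t1id
FROM   table1 t1
CROSS  JOIN LATERAL (
   SELECT id FROM table2 WHERE tags        ~* replace(t1.name, ' ', '\s?\*?')
   UNION
   SELECT id FROM table2 WHERE description ~* replace(t1.name, ' ', '\s?\*?')
   ) t2
ORDER  BY t2.id, t1.id;

-- 4. Phrase search with full text search (Postgres 9.6 or later).
CREATE INDEX table2_description_fts_idx
ON     table2 USING gin (to_tsvector('simple', description));

SELECT t2.id AS t2id, t1.id AS t1id
FROM   table1 t1
JOIN   table2 t2 ON to_tsvector('simple', t2.description)
                 @@ phraseto_tsquery('simple', t1.name)
ORDER  BY t2.id, t1.id;
```

Phrase search matches whole words in sequence, so 'south africa' would match the text "south africa" but not the single token "southafrica". That is one reason normalized, cleaned-up tags combine well with it.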

edited May 23 '17 at 12:16 by Community
answered Sep 08 '16 at 13:46 by Erwin Brandstetter

Thank you for the reply! I am too new to PostgreSQL to use the search for multiple adjacent words! :( – GeoBeez Sep 08 '16 at 14:23
I wanted to do the search in Java, but I thought it might be faster in the database! – GeoBeez Sep 08 '16 at 14:29
@Raha1986: Pattern matching is a complex matter. Details of the requirements matter. If done right, your RDBMS (especially Postgres) will execute it *much* faster than any other instance in your tool chain. – Erwin Brandstetter Sep 08 '16 at 14:48
So the best way is to use the new capability of Postgres 9.6, right? Is it the most efficient way? – GeoBeez Sep 08 '16 at 16:09
@Raha1986: Probably yes - in combination with my other advice. – Erwin Brandstetter Sep 08 '16 at 16:14
