
I have table1, which contains around 1 million rows, and table2, which contains around 3 million rows.

I received the code below in answer to a previous post (link). It updates the table as expected, but it takes more than an hour to update just 100 rows.

do $$ 
declare 
  rec record;
  current_result int;
begin
  -- For each row of table1, count the groups that the column list
  -- stored in table1.joins produces in table2, then write the count back.
  for rec in select joins from table1 loop
     execute format('select count(*) 
                     from (
                        select 1
                        from table2 
                        group by %1$s 
                        having count(*)>=1 
                     ) as some_alias;',
                     rec.joins) 
              into current_result;
     update table1 set result = current_result where joins = rec.joins;
  end loop;
end $$;

I would appreciate it if somebody could suggest alternative code to handle the above requirement.

The table definitions can be seen at the following link (https://dbfiddle.uk/pP_9mKy3), and the previous post can be seen from the following link

  • Why is the `having count(*)>=1` needed? It seems kinda pointless to me. (And [the result appears to be the same without](https://dbfiddle.uk/EApjdGc8)) – Bergi Jan 01 '23 at 01:25
    Instead of selecting `count(*)` from a group-by subquery, I would just use `select count(distinct %1$s) from table2`. (Admittedly it [was reported to be slow](https://stackoverflow.com/questions/11250253/postgresql-countdistinct-very-slow), no idea whether that's been fixed). With this approach, you can also do all the different `count()`s in a single query (as different columns) - not sure whether that might speed it up? – Bergi Jan 01 '23 at 01:32
    What are the query plans for these queries? Do you have any indices on the relevant columns to speed this up? – Bergi Jan 01 '23 at 01:33
  • @Bergi I tried with the following code, `select count(distinct %1$s) from table2`, but it still takes hours to complete just a few rows. What if we concat/join all the columns of table2 based on the criteria in table1.joins into a separate table and then count them? But I think that would be a huge amount of data and joining it would take significant time. – Gulya Jan 01 '23 at 15:30
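A minimal sketch of the `count(distinct …)` approach suggested in the comments. Two assumptions here: that `table1.joins` holds a comma-separated column list valid in `table2` (as in the original code), and that many `table1` rows share the same `joins` value, so looping over distinct values avoids re-scanning `table2` once per row:

```sql
do $$
declare
  rec record;
  current_result int;
begin
  -- Loop over DISTINCT join lists only, so each grouping of table2
  -- is computed once instead of once per table1 row.
  for rec in select distinct joins from table1 loop
     -- count(distinct (a, b, ...)) counts distinct row values,
     -- which equals the number of groups in the original group-by query.
     execute format('select count(distinct (%1$s)) from table2', rec.joins)
        into current_result;
     -- One update covers every table1 row sharing this join list.
     update table1 set result = current_result where joins = rec.joins;
  end loop;
end $$;
```

Whether this is faster depends on the query plans and on indexes covering the grouped columns, as asked in the comments above.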
