0

I two tables table1 and table2 in postgresql-12:

DELETE FROM table2 WHERE (ddate, itemno) NOT IN (SELECT ddate,itemno FROM table1);

table1 has around 7000000 entries. table2 has around 9500000 entries. Around 100000 matches are expected. This command takes 5 hours to execute. How can I optimize this query?

Adrian Klaver
  • 15,886
  • 2
  • 17
  • 28
Sourav
  • 129
  • 12
  • 1
    It seems this [answer](https://dba.stackexchange.com/a/35007) on a similar question could help you. – Jonas Metzler Jul 31 '23 at 08:21
  • 3
    Maybe it's already as optimised as it can be? Nobody can tell because you didn't bother to include the query plan. It's never going to be super-fast you have to scan at least one of the tables completely. – Richard Huxton Jul 31 '23 at 08:37
  • @RichardHuxton how do I search for the query plan? Please bear with me, I am new to postgresql – Sourav Jul 31 '23 at 08:43
  • @JonasMetzler how do I convert this NOT IN part to IN part, as mentioned in the link? – Sourav Jul 31 '23 at 08:44
  • Also, triggers are disabled for both the tables – Sourav Jul 31 '23 at 08:44
  • OK, before you go any further google a bit around "postgresql query plan explain". There are two variants of that command you will want the plain EXPLAIN. – Richard Huxton Jul 31 '23 at 08:46
  • @Sourav Use LEFT JOIN with IS NULL as explained in the answer rather than NOT IN, put it in a CTE or subquery. – Jonas Metzler Jul 31 '23 at 08:58
  • See [Best way to delete millions of rows](https://stackoverflow.com/questions/8290900/best-way-to-delete-millions-of-rows-by-id/8290958#8290958) – Ergest Basha Jul 31 '23 at 09:34

1 Answers1

0

To answer your question you need to provide more information about the tables or the query plan as @Richard Huxton just said, you can get the query plan by including EXPLAIN before any command you want to get its query plan, in your example it will be:

EXPLAIN DELETE FROM table2 WHERE (ddate, itemno) NOT IN (SELECT ddate,itemno FROM table1);

Do that and include it in the question, however if you didn't use indexes on your tables, you should try to do so, this may result in a significant speed up of the query. Check the docs here

ahmed_131313
  • 142
  • 6