
I need to remove several hundred thousand rows from my table. I know the primary keys of all the rows I wish to remove (they are actually the results from another query). What is the fastest way to do this? This is what I have so far:

DELETE FROM my_table
WHERE id IN (SELECT id FROM another_table);

I know this will work, but it takes a very long time because the inner query (SELECT id FROM another_table) returns several hundred thousand rows. My understanding is that the database has to search through those several hundred thousand results from another_table once for every row in my_table.

What I suspect would be faster is issuing a separate DELETE for each id returned by SELECT id FROM another_table. I could do this with a small Python script, but I want to avoid introducing Python into my project stack. Is there any way to do this with SQL scripting alone?
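As a sketch, assuming MySQL (the comments below link to the MySQL manual) and that another_table.id is indexed (e.g. as its primary key), the IN subquery can be rewritten as a multi-table DELETE using a join, and long-running deletes can be split into batches. Table and column names are taken from the question; the batch size is an arbitrary illustration.

```sql
-- Rewrite the IN subquery as a join delete; the optimizer can then
-- pick a join strategy (check EXPLAIN) instead of whatever plan the
-- subquery produced.
DELETE my_table
FROM my_table
JOIN another_table ON another_table.id = my_table.id;

-- To keep locks and the transaction log small, delete in batches and
-- rerun until no rows are affected. MySQL disallows LIMIT on a
-- multi-table DELETE, so the batched form keeps the subquery:
DELETE FROM my_table
WHERE id IN (SELECT id FROM another_table)
LIMIT 10000;
```

Whether the join form is actually faster depends on the execution plan; as the commenters note, the optimizer may already handle the IN subquery efficiently, so comparing EXPLAIN output for both forms is the safest way to decide.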

mevers303
  • Can you post the execution plan of the query as it is? – The Impaler Nov 04 '19 at 02:57
  • It doesn't mean 'repeated several hundred thousand times'. The result set of the inner SELECT should be traversed, not 'searched', exactly once. What you claim would be faster would in fact be much slower. – user207421 Nov 04 '19 at 03:59
  • Sometimes [the optimizer is smarter](https://dev.mysql.com/doc/refman/8.0/en/subquery-optimization-with-exists.html) than one might give it credit for. As The Impaler said: check (or add, so we can help you check) the execution plan if it is showing something strange. – Solarflare Nov 04 '19 at 07:43

0 Answers