Delete records from Amazon Redshift based on primary key column values stored in a file

Question

I have 14k unique IDs in a .txt file, and I want to delete the matching rows from an Amazon Redshift table. I tried putting all the IDs inside an IN clause, but it does not work: the query keeps running for a long time.

E.g.:

delete from <table_name> where <primary_key_column> in (1,2,3....,14000);

score 0 · Answer 1 · answered Sep 03 '20 at 15:27

Referring to this post might help. It looks like such a large IN list is going to take the database quite some time to process.

Personally, I would handle this programmatically: loop over the IDs and break the delete into multiple smaller statements.

Alternatively, you could keep the programmatic approach but wrap the individual statements in a single transaction.

Something like:

begin read write;

delete from <table> where <col> = 1;
...
delete from <table> where <col> = 14000;

commit;

score 0 · Answer 2 · answered Sep 04 '20 at 07:44

I would recommend:

1. Load the text file as a new table
2. Delete records where the ID is in that table

Something like:

DELETE FROM table1
WHERE id IN (SELECT id FROM table2);
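
The two steps above could be sketched end to end as follows. This is only a minimal sketch under assumptions that are not in the original answer: the file has been uploaded to S3 with one ID per line, and the names ids_to_delete, table1, id, the bucket path, and the IAM role ARN are all hypothetical placeholders. It also uses a join-style DELETE ... USING in place of the IN subquery; either form should work.

```sql
-- Hypothetical names: ids_to_delete, table1, id, the S3 path, and the role ARN.

-- 1. Staging table with one ID per row (dropped automatically at session end).
CREATE TEMP TABLE ids_to_delete (id BIGINT);

-- 2. Bulk-load the file, assumed to be in S3 with one ID per line.
COPY ids_to_delete
FROM 's3://my-bucket/ids.txt'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole';

-- 3. Delete matching rows via a join rather than a 14k-item literal list.
DELETE FROM table1
USING ids_to_delete
WHERE table1.id = ids_to_delete.id;

-- 4. Optional: drop the staging table explicitly once the delete is done.
DROP TABLE ids_to_delete;
```

After a large delete, running VACUUM on the target table may be worthwhile to reclaim the space held by the deleted rows.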

John Rotenstein
