I am trying to delete duplicated rows from a table but seems that is not working.
The query
DELETE FROM notifications N1
USING notifications N2
WHERE N1.id < N2.id
AND N1.user_id = N2.user_id
AND N1.notification_type = N2.notification_type;
Some SGDBs says that there is no relation to notifications table. If I run in another app it takes like 1hr to delete them but looks like in the end it never commit and goes back to the same state before. The database is an AWS RDS instance and the table contains more than 40 million rows, and I expect to have less than 1 mi in the end.
Assumptions:
- There is a Postgres lock for huge delete query.
- Because of the size of the table, it drops the connection and never commits.
- Because of the size and the table usability, I'll have to add a lock to the query.
- Maybe I'll have to duplicate the table, make the delete and then replace the old table with the new table.
I am not sure what approach I must follow.