I need to delete the majority (say, 90%) of a very large table (say, 5m rows). The other 10% of this table is frequently read, but not written to.
From "Best way to delete millions of rows by ID", I gather that I should remove any index on the 90% I'm deleting, to speed up the process (except an index I'm using to select the rows for deletion).
From "PostgreSQL locking mode", I see that this operation will acquire a ROW EXCLUSIVE
lock on the entire table. But since I'm only reading the other 10%, this ought not matter.
So, is it safe to delete everything in one command (i.e. DELETE FROM table WHERE delete_flag='t'
)? I'm worried that if the deletion of one row fails, triggering an enormous rollback, then it will affect my ability to read from the table. Would it be wiser to delete in batches?