First of all, with 3 columns of ~300 kB of text each and just 100 rows, the table should not be over 1 GB in size. Do you have autovacuum enabled? Or do you run VACUUM regularly? Read about vacuuming in the manual and check your settings.
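To check, you can look at the autovacuum setting and at the dead-tuple statistics for the table. A minimal sketch, assuming your table is named tbl (a placeholder):

    -- Is autovacuum switched on for the cluster?
    SHOW autovacuum;

    -- How many dead tuples has the table accumulated, and when was it last vacuumed?
    SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM   pg_stat_user_tables
    WHERE  relname = 'tbl';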
As to the question of UPDATE versus INSERT / DELETE: due to PostgreSQL's MVCC model, every UPDATE is effectively a new INSERT. The old row version is kept and eventually purged by (auto)vacuum. Use UPDATE, it's simpler. The key to recovering space and performance is vacuuming.
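For illustration, a sketch of the plain UPDATE route, assuming hypothetical column names big1, big2, big3 for the large text columns:

    -- set the big columns to NULL; each updated row leaves a dead row version behind
    UPDATE tbl
    SET    big1 = NULL
         , big2 = NULL
         , big3 = NULL;

    -- make the dead row versions reusable and refresh planner statistics
    VACUUM ANALYZE tbl;

Keep in mind that a plain VACUUM only marks the space as reusable within the table file; it does not shrink the file on disk.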
However, if you have to process all or most rows, and can afford to lock the table exclusively (or there is no concurrent access), it will be faster to rewrite the table completely, either by creating a new one, or like this:
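(A minimal sketch: tbl, id, col_a, and col_b are placeholder names; the big columns are simply left out of the copy, so they come back as NULL.)

    BEGIN;

    -- copy only the columns worth keeping into a temporary table
    CREATE TEMP TABLE tmp AS
    SELECT id, col_a, col_b
    FROM   tbl;

    -- empty the original table; takes an exclusive lock
    TRUNCATE tbl;

    -- write the rows back; the omitted big columns default to NULL
    INSERT INTO tbl (id, col_a, col_b)
    SELECT id, col_a, col_b
    FROM   tmp;

    COMMIT;

    -- refresh planner statistics after the bulk rewrite
    ANALYZE tbl;

Note that TRUNCATE refuses to run if other tables have foreign keys pointing at tbl, and the whole operation holds an exclusive lock on the table for its duration.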
There is also the related command CLUSTER, which rewrites the whole table and all indexes in pristine condition. It beats VACUUM FULL because it is effectively also a REINDEX. You can try it once to see the real size of the table without dead tuples.
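A sketch, assuming the table has a primary key index named tbl_pkey (a placeholder):

    -- rewrite the table and all its indexes, ordered by the given index; takes an exclusive lock
    CLUSTER tbl USING tbl_pkey;

    -- the real size of the table including indexes and TOAST, without dead tuples
    SELECT pg_size_pretty(pg_total_relation_size('tbl'));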
Concerning performance: PostgreSQL writes the (compressed) content of big columns to TOAST tables. So standard queries perform the same, regardless of whether a column holds 30 kB or 30 MB; the column size only affects the time needed to retrieve the rows that are found.
As out-of-line (TOASTed) data also gets compressed, your table should not be over 1 GB (including the TOAST table), even though the on-disk representation in the database normally occupies 2-3 times the space of the raw data. This normally means that you are not vacuuming, and a lot of dead tuples are cluttering up your table.
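To see where the space actually goes, you can break the total down into the main relation, the TOAST data, and the indexes. A sketch, again using the placeholder name tbl:

    SELECT pg_size_pretty(pg_relation_size('tbl'))        AS main_table
         , pg_size_pretty(pg_table_size('tbl')
                        - pg_relation_size('tbl'))        AS toast_and_maps  -- TOAST plus free-space / visibility maps
         , pg_size_pretty(pg_indexes_size('tbl'))         AS indexes
         , pg_size_pretty(pg_total_relation_size('tbl'))  AS total;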
One side effect of TOAST tables is that UPDATE can perform better than INSERT / DELETE, because:
During an UPDATE operation, values of unchanged fields are normally preserved as-is; so an UPDATE of a row with out-of-line values incurs no TOAST costs if none of the out-of-line values change.
Shouldn't matter in your case, though, as your plan is to NULL the big columns anyway.