1

I have a question about manipulating my main table : I need to add a new column into it, but there is already more than 50 million entries to update.
It would be as easy as an alter table if I wasn't storing my data on a small ssd that is already filled a bit more than half by the database.
The alter table was not able to finish, because it creates a copy of the table and only then replaces the original.

Two copies can't fit on my ssd drive, so is it possible to either operate directly on the original table, or to use another drive for the temporary table ? Or should I use another computer to alter my table on a bigger drive, then only transfer back to my ssd server ?

Thanks in advance !

PS: Time is not really an issue, it's a non realtime system in alpha with almost no user, so I am allowed to take down everything for a few hours (thank the Internet gods !)

Wiz
  • 11
  • 4

2 Answers2

2

PostgreSQL can do in-place ADD COLUMN if the column has no DEFAULT and is nullble.

So: Use the same method you'd use to handle it in a high-concurrency situation, but without the triggers you'd use to handle it with concurrency.

  • ALTER TABLE ... ADD COLUMN ... without any DEFAULT or NOT NULL
  • ALTER TABLE ... ALTER COLUMN ... DEFAULT ... to add the DEFAULT if any
  • UPDATE the table in batches of rows to set the value. VACUUM the table between each batch. Do not use VACUUM FULL. Each batch must be a new, separate transaction that commits before the VACUUM.
  • If desired, ALTER TABLE ... ALTER COLUMN ... NOT NULL after all rows have a value set
Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • So the right term is "in-place", great now I have a name on it ! I'm currently updating the table, going to take a while but it seems good, thanks for the answer ! However I was wondering, if I wish to add an index on the column afterwards is it going to take in-place ? It seems logical it would not need to duplicate the table just to make an index, but I'm still asking just to be sure. – Wiz Nov 18 '15 at 10:58
  • @Wiz In PostgreSQL terms we usually talk about avoiding a full-table rewrite. No, indexes don't need to rewrite the whole table. – Craig Ringer Nov 18 '15 at 12:25
0

According to my quick googling Postgres does not support inplace add column. Since time is not an issue for you, you can simulate inplace add column by creating a new table with the new schema, then copying first 100000 rows from the old table to the new table using INSERT ... SELECT, then deleting the first 100000 rows [1] from the old table, and repeating this process until the entire table is copied. Then at the end just drop the old table and rename the new table. This way the space overhead will be constant at all times.

[1] How do I delete a fixed number of rows with sorting in PostgreSQL?

Community
  • 1
  • 1
Ishamael
  • 12,583
  • 4
  • 34
  • 52