39

I have a pretty big table (around 1 billion rows), and I need to change the id type from SERIAL to BIGSERIAL; guess why? :)

Basically this could be done with this command:

execute "ALTER TABLE my_table ALTER COLUMN id SET DATA TYPE bigint"

Nevertheless, that would rewrite the whole table under an exclusive lock, locking it for hours and taking my web service down.

Is there a reasonably simple way of doing this operation concurrently, however long it takes?
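
For context: a SERIAL (integer) id tops out at 2,147,483,647. A quick way to see how much headroom is left, assuming the sequence kept the default name SERIAL gives it, is:

select last_value, 2147483647 - last_value as ids_left
from my_table_id_seq;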

Pierre Michard

4 Answers

36

If you don't have foreign keys pointing at your id column, you can add a new column, fill it, drop the old one, and rename the new one to the old name:

alter table my_table add column new_id bigint;

begin; update my_table set new_id = id where id between 0 and 100000; commit;
begin; update my_table set new_id = id where id between 100001 and 200000; commit;
begin; update my_table set new_id = id where id between 200001 and 300000; commit;
begin; update my_table set new_id = id where id between 300001 and 400000; commit;
...
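
If you'd rather not write out every batch by hand, the loop can be scripted. A minimal sketch using a DO block (PostgreSQL 11+ is needed for commit inside the block; the batch size and names are illustrative):

do $$
declare
  batch_start bigint := 0;
  batch_size  constant bigint := 100000;
  max_id      bigint;
begin
  select coalesce(max(id), 0) into max_id from my_table;
  while batch_start <= max_id loop
    update my_table
       set new_id = id
     where id between batch_start and batch_start + batch_size - 1;
    commit;  -- each batch in its own transaction (PostgreSQL 11+)
    batch_start := batch_start + batch_size;
  end loop;
end;
$$;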

create unique index my_table_pk_idx on my_table(new_id);

begin;
alter table my_table drop constraint my_table_pk;
alter table my_table alter column new_id set default nextval('my_table_id_seq'::regclass);
update my_table set new_id = id where new_id is null;
alter table my_table add constraint my_table_pk primary key using index my_table_pk_idx;
alter table my_table drop column id;
alter table my_table rename column new_id to id;
commit;
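
To verify the "no foreign keys pointing at id" precondition, one possible catalog query (table name as in the example) is:

select conname, conrelid::regclass as referencing_table
from pg_constraint
where contype = 'f'
  and confrelid = 'my_table'::regclass;
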
Radek Postołowicz
  • Thanks, this solution is pretty elegant. It seems to me that there is still a problem: if new rows are inserted into the table while we are filling the new_id column, their new_id won't be set and the unique index creation may fail. Can we add a trigger that sets new_id on insert until we add the nextval default on it? – Pierre Michard Nov 04 '15 at 09:35
  • Unique index ignores `null` values, so there is no need for trigger. – Radek Postołowicz Nov 04 '15 at 11:28
  • What if `alter table my_table add column new_id bigint;` takes a long time (it has been running for more than 1 hour and has not finished yet) and blocks other read operations? – Cheok Yan Cheng Feb 14 '17 at 10:15
  • Honestly: I don't know :). Maybe ask a separate question about how to speed up adding the column itself? – Radek Postołowicz Feb 14 '17 at 16:47
  • When you say "If you don't have foreign keys pointing at your id", what does that mean? Do you mean this solution won't work if any other table has a column that is a foreign key to this table? – TheJKFever Jan 10 '19 at 02:33
  • I'm getting the following error: ``` PG::DependentObjectsStillExist: ERROR: cannot drop table users column id because other objects depend on it DETAIL: default for table users column new_id depends on sequence users_id_seq ``` – TheJKFever Jan 11 '19 at 03:12
  • It seems that this line `alter table my_table alter column new_id set default nextval('my_table_id_seq'::regclass);` adds a dependency that prevents dropping id. – TheJKFever Jan 11 '19 at 03:36
  • This solution is not concurrent. The `alter table my_table add constraint my_table_pk primary key using index my_table_pk_idx;` adds a NOT NULL constraint onto the primary key column, which requires a table scan to verify the constraint. – James Stonehill Sep 03 '20 at 11:55
  • > It seems that this line ... adds a dependency that prevents dropping id. I'd add to the code block the following command after _this_ line: `ALTER SEQUENCE my_table_id_seq OWNED BY my_table.new_id;` – denis-sumin Mar 04 '22 at 22:08
  • @JamesStonehill did you find a concurrent solution in the end? – brauliobo Aug 29 '22 at 00:14
  • The best way is to create a new table. If you can, you should set the old table to read-only while you copy the data across. If you can't, then the best thing to do is to use triggers to duplicate writes across to the new table while you're backfilling it. It's very messy though. – James Stonehill Sep 03 '22 at 09:52
4

Radek's solution looks great. I would add a comment if I had the reputation for it, but I just want to mention that if you are doing this you'll likely want to widen the sequence for the primary key as well.

ALTER SEQUENCE my_table_id_seq AS bigint;

If you just widen the column type, you'll still end up with problems when you hit 2 billion records if the sequence is still integer sized.
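
Note that `ALTER SEQUENCE ... AS` only exists on PostgreSQL 10 and later; on those versions you can also confirm the sequence's current type from the pg_sequence catalog, assuming the default sequence name:

SELECT seqtypid::regtype AS sequence_type
FROM pg_sequence
WHERE seqrelid = 'my_table_id_seq'::regclass;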

I think the issue that James points out about adding the primary key requiring a table scan can be solved with the `NOT VALID`/`VALIDATE` dance. Instead of doing `alter table my_table add constraint my_table_pk primary key using index my_table_pk_idx;`, you can do

ALTER TABLE my_table ADD UNIQUE USING INDEX my_table_pk_idx;
ALTER TABLE my_table ADD CONSTRAINT my_table_id_not_null CHECK (id IS NOT NULL) NOT VALID;
ALTER TABLE my_table VALIDATE CONSTRAINT my_table_id_not_null;
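
On PostgreSQL 12 and later, a validated CHECK constraint like the one above also lets a plain SET NOT NULL skip the full-table scan, so you could finish with (a sketch building on the constraint name above):

ALTER TABLE my_table ALTER COLUMN id SET NOT NULL;  -- no scan needed: proven by the validated CHECK (PostgreSQL 12+)
ALTER TABLE my_table DROP CONSTRAINT my_table_id_not_null;  -- redundant once the real NOT NULL exists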

I think it's also worth mentioning that

create unique index my_table_pk_idx on my_table(new_id);

will do a full table scan while holding a lock on my_table that blocks all writes for the duration. It is better to do

CREATE UNIQUE INDEX CONCURRENTLY ON my_table(new_id);
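
One caveat: if CREATE INDEX CONCURRENTLY is interrupted or fails (for example on a duplicate value), it leaves an INVALID index behind that must be dropped before retrying. Leftovers can be spotted with a catalog query such as:

SELECT indexrelid::regclass AS invalid_index
FROM pg_index
WHERE indrelid = 'my_table'::regclass AND NOT indisvalid;
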
3

Merging both @radek-postołowicz's and @ethan-pailes's answers into a fully concurrent solution, with some tweaks, we get:

alter table my_table add column new_id bigint;

-- new records filling
CREATE FUNCTION public.my_table_fill_newid() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  new.new_id := new.id;
  RETURN new;
END;
$$;
-- EXECUTE FUNCTION requires PostgreSQL 11+; on older versions use EXECUTE PROCEDURE
CREATE TRIGGER my_table_fill_newid BEFORE INSERT ON my_table
  FOR EACH ROW EXECUTE FUNCTION public.my_table_fill_newid();

-- old records filling
update my_table set new_id = id where id between 0 and 100000;
update my_table set new_id = id where id between 100001 and 200000;
update my_table set new_id = id where id between 200001 and 300000;
...

-- slow but concurrent part
create unique index concurrently my_table_pk_idx on my_table(new_id);
ALTER TABLE my_table ADD CONSTRAINT my_table_new_id_not_null
  CHECK (new_id IS NOT NULL) NOT VALID; -- delay validate for concurrency
ALTER TABLE my_table VALIDATE CONSTRAINT my_table_new_id_not_null; 

-- locking
begin;
ALTER TABLE my_table alter column new_id set not null; -- needed for pkey
ALTER TABLE my_table drop constraint my_table_new_id_not_null;
ALTER SEQUENCE my_table_id_seq AS bigint;

alter table my_table drop constraint my_table_pk;
alter table my_table add constraint my_table_pk primary key using index my_table_pk_idx;

alter table my_table drop column id;
alter table my_table rename column new_id to id;

drop trigger my_table_fill_newid on my_table;
commit;
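
After the swap, a quick sanity check (not part of the recipe itself) is to confirm the column is now bigint:

select atttypid::regtype as id_type
from pg_attribute
where attrelid = 'my_table'::regclass and attname = 'id';
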
brauliobo
0

I tried @radek-postołowicz's solution, but it failed for me: I needed to set the new_id column to not null, and that locks the table for a long time.

My solution:

  1. Select the records from the old table and insert them into a new table my_table_new, with id being bigint. Run this as a standalone transaction.
  2. In another transaction: repeat step 1 for the records created in the meantime, then drop my_table and rename my_table_new to my_table. A rough sketch follows.
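
A rough sketch of those two steps (names are illustrative; indexes, constraints, and grants copied via LIKE ... INCLUDING ALL still deserve a review):

-- step 1: bulk copy, as its own transaction
create table my_table_new (like my_table including all);
alter table my_table_new alter column id type bigint;
insert into my_table_new select * from my_table;

-- step 2: block writers, copy the stragglers, swap the tables
begin;
lock table my_table in exclusive mode;
insert into my_table_new
  select * from my_table
  where id > (select coalesce(max(id), 0) from my_table_new);
alter sequence my_table_id_seq owned by my_table_new.id;  -- re-point ownership so dropping the old table keeps the sequence
drop table my_table;
alter table my_table_new rename to my_table;
commit;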

The downside of this solution is that it auto-scaled the storage of my AWS RDS instance, and the extra storage could not be scaled back down.

xhafan