I'd like to backfill a column of a large (20M rows), frequently-read but rarely-written table. From various articles and questions on SO, it seems the best way to do this is to create a table with an identical structure, load in the backfilled data, and live-swap the two (since renaming is pretty quick). Sounds good!
But when I actually write the script to do this, it is mind-blowingly long. Here's a taste:
BEGIN;
CREATE TABLE foo_new (LIKE foo);
-- I don't use INCLUDING ALL, because that produces Indexes/Constraints with different names
-- This is the only part of the script that is specific to my case.
-- Everything else is standard for any table swap
INSERT INTO foo_new (id, first_name, last_name, email, full_name)
(SELECT id, first_name, last_name, email, first_name || last_name FROM foo);
CREATE SEQUENCE foo_new_id_seq
START 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
SELECT setval('foo_new_id_seq', COALESCE((SELECT MAX(id)+1 FROM foo_new), 1), false);
ALTER SEQUENCE foo_new_id_seq OWNED BY foo_new.id;
ALTER TABLE ONLY foo_new ALTER COLUMN id SET DEFAULT nextval('foo_new_id_seq'::regclass);
ALTER TABLE foo_new
ADD CONSTRAINT foo_new_pkey
PRIMARY KEY (id);
COMMIT;
-- Indexes are made concurrently, otherwise they would block writes for
-- a long time. Concurrent index creation cannot occur within a transaction.
CREATE INDEX CONCURRENTLY foo_new_on_first_name ON foo_new USING btree (first_name);
CREATE INDEX CONCURRENTLY foo_new_on_last_name ON foo_new USING btree (last_name);
CREATE INDEX CONCURRENTLY foo_new_on_email ON foo_new USING btree (email);
-- One more line for each index
BEGIN;
ALTER TABLE foo RENAME TO foo_old;
ALTER TABLE foo_new RENAME TO foo;
ALTER SEQUENCE foo_id_seq RENAME TO foo_old_id_seq;
ALTER SEQUENCE foo_new_id_seq RENAME TO foo_id_seq;
ALTER TABLE foo_old RENAME CONSTRAINT foo_pkey TO foo_old_pkey;
ALTER TABLE foo RENAME CONSTRAINT foo_new_pkey TO foo_pkey;
ALTER INDEX foo_on_first_name RENAME TO foo_old_on_first_name;
ALTER INDEX foo_on_last_name RENAME TO foo_old_on_last_name;
ALTER INDEX foo_on_email RENAME TO foo_old_on_email;
-- One more line for each index
ALTER INDEX foo_new_on_first_name RENAME TO foo_on_first_name;
ALTER INDEX foo_new_on_last_name RENAME TO foo_on_last_name;
ALTER INDEX foo_new_on_email RENAME TO foo_on_email;
-- One more line for each index
COMMIT;
-- TODO: drop old table (CASCADE)
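The cleanup step left as a TODO might look like the following. This is only a sketch, assuming the swap has committed and nothing still needs foo_old:

```sql
-- Sketch of the final cleanup, run only after the swap transaction commits.
-- The old sequence (renamed to foo_old_id_seq above) is owned by foo_old.id,
-- so it is dropped together with the table.
BEGIN;
DROP TABLE foo_old CASCADE;
COMMIT;
```

One caveat: foreign keys in other tables that referenced foo follow the rename and now point at foo_old, so CASCADE would drop those constraints rather than move them to the new table.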
And this doesn't even include foreign keys, or other constraints! Since the only part of this that is specific to my case is the INSERT INTO bit, I'm surprised that there's no built-in Postgres function to do this sort of swapping. Is this operation less common than I make it out to be? Am I underestimating the variety of ways this can be accomplished? Is my desire to keep naming consistent an atypical one?
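One way the per-index rename lines could be cut down is to generate them from the system catalog instead of typing them by hand. A minimal sketch, assuming the tables live in the public schema and every new index carries the foo_new_ prefix used in the script above (the rename_stmt alias is just an illustrative name):

```sql
-- Sketch: emit one ALTER INDEX ... RENAME statement per index on foo_new.
-- Assumptions: schema is 'public'; new indexes are named foo_new_<suffix>.
-- The primary-key index is skipped because the script renames it through
-- ALTER TABLE ... RENAME CONSTRAINT.
SELECT format(
         'ALTER INDEX %I.%I RENAME TO %I;',
         schemaname,
         indexname,
         regexp_replace(indexname, '^foo_new_', 'foo_')
       ) AS rename_stmt
FROM pg_indexes
WHERE schemaname = 'public'
  AND tablename  = 'foo_new'
  AND indexname <> 'foo_new_pkey'
ORDER BY indexname;
```

The generated statements would still have to run inside the swap transaction, after the old indexes have been renamed out of the way. The same query pointed at foo, with a replacement of 'foo_old_', could presumably produce those lines as well.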