I have to migrate a large amount of existing data in a Postgres DB after a schema change.
In the old schema a country attribute would be stored in the users table. Now the country attribute has been moved into a separate address table:
users:
country # OLD
address_id # NEW [1:1 relation]
addresses:
id
country
The schema is actually more complex and the address contains more than just the country. Thus, every user needs to have his own address (1:1 relation).
When migrating the data, I'm having problems setting the foreign keys in the users table after inserting the addresses:
INSERT INTO addresses (country)
SELECT country FROM users WHERE address_id IS NULL
RETURNING id;
How do I propagate the IDs of the inserted rows and set the foreign key references in the users table?
The only solution I could come up with so far is creating a temporary user_id column in the addresses table and then updating the the address_id:
UPDATE users SET address_id = a.id FROM addresses AS a
WHERE users.id = a.user_id;
However, this turned out to be extremely slow (despite using indices on both users.id and addresses.user_id).
The users table contains about 3 million rows with 300k missing an associated address.
Is there any other way to insert derived data into one table and setting the foreign key reference to the inserted data in the other (without changing the schema itself)?
I'm using Postgres 8.3.14.
Thanks
I have now solved the problem by migrating the data with a Python/sqlalchemy script. It turned out to be much easier (for me) than trying the same with SQL. Still, I'd be interested if anybody knows a way to process the RETURNING result of an INSERT statement in Postgres SQL.