
I have a database on AWS RDS. I populate it by taking a pg_dump from a local version of the database, then running psql dbname < pg_dump_file (with the proper arguments for the remote upload).

I'd like to know what is expected to happen if that rds db already contains data. More specifically:

  1. Data present in the local dump, but absent on RDS
  2. Data present on RDS, but absent in the local dump
  3. Data present in both, but modified on one side

My current understanding:

  1. New data will be added and be present in both after the upload
  2. Data already on RDS should be unaffected?
  3. For rows with the same pk but different fields otherwise, the values from the pg_dump will win

Is that about correct? I've been reading this, but it's a little thin on how the restore is actually performed, so I'm having a harder time figuring that out. Thanks.

EDIT: following @wildplasser's comment, looking at the pg_dump file it appears that the following happens:

CREATE TABLE [....]
ALTER TABLE [setting table owner]
ALTER SEQUENCE [....]

This repeats for each table in the db. Then, again one table at a time:

COPY [tablename] (list of cols) FROM stdin;
[data to be copied]

Finally, more ALTER statements to set constraints, foreign keys, etc.

So I guess the ultimate answer is "it depends". One could, I suppose, remove the CREATE TABLE [...], ALTER TABLE and ALTER SEQUENCE statements if those objects are already created as they should be. I am not positive yet what happens if one tries CREATE TABLE on an existing table (an error thrown, perhaps?).
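Out of curiosity, here is a quick sketch of the CREATE TABLE question. It uses Python's sqlite3 as a stand-in for a real Postgres connection (the table name is made up), but Postgres behaves the same way: a plain CREATE TABLE against an existing table raises an error, while CREATE TABLE IF NOT EXISTS is a harmless no-op.

```python
import sqlite3

# In-memory database standing in for the target (e.g. the RDS instance).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

# Replaying the dump's CREATE TABLE against an existing table fails.
try:
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
except sqlite3.OperationalError as e:
    print(e)  # table items already exists

# IF NOT EXISTS turns the statement into a no-op instead of an error.
conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)")
```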

Then I guess the COPY statements would overwrite whatever already exists. Or perhaps throw an error. I'll have to test that. I'll write up an answer once I figure it out.

logicOnAbstractions
    Your `pg_dump_file` contains plain SQL. You can inspect it to see what it does. (it could or could not contain a `DROP database` or `CREATE database` statement) – wildplasser May 31 '22 at 13:28
  • "with proper arguments for remote upload". I know what the arguments do. I don't know which ones you consider proper for that purpose. Please just tell us what you did. – jjanes May 31 '22 at 13:36
  • @wildplasser oh right, that makes a lot of sense, I'll inspect that. – logicOnAbstractions May 31 '22 at 13:36
  • @jjanes well the upload of the data dump works - I just want to know how it behaves with existing data before I try it. The question isn't about how to upload to rds using psql. – logicOnAbstractions May 31 '22 at 13:37

1 Answer


So the answer is a bit dull. It turns out that even if one removes the initial statements before the COPY, if the table has a primary key (and thus a uniqueness constraint) the restore won't work:

ERROR: duplicate key value violates unique constraint

So one gets shut down pretty quickly there. One would, I guess, have to rewrite the dump as a list of UPDATE statements instead, but at that point one might as well write a script to do it. I'm not sure pg_dump is all that useful in that case.
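To make the failure (and one possible workaround) concrete, here is a small sketch using Python's sqlite3 in place of Postgres; the table and column names are made up. Postgres raises the same kind of duplicate-key error, and its INSERT ... ON CONFLICT ... DO UPDATE syntax (available since 9.5) works like the upsert shown here:

```python
import sqlite3

# In-memory stand-in for the target database, pre-populated with one row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items VALUES (1, 'old value')")

# Replaying a plain insert for an existing key fails, just like the
# COPY rows in the restored dump do.
try:
    conn.execute("INSERT INTO items VALUES (1, 'new value')")
except sqlite3.IntegrityError as e:
    print(e)  # UNIQUE constraint failed: items.id

# An upsert updates the conflicting row instead of failing.
conn.execute(
    "INSERT INTO items VALUES (1, 'new value') "
    "ON CONFLICT(id) DO UPDATE SET name = excluded.name"
)
print(conn.execute("SELECT name FROM items WHERE id = 1").fetchone()[0])
```

(Note: the ON CONFLICT syntax requires SQLite 3.24+; pg_dump itself can emit upsert-friendly statements with --inserts --on-conflict-do-nothing, though that skips conflicting rows rather than updating them.)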

logicOnAbstractions