
I've seen a lot of similar topics, but no clear solution.

What I have: a production DB on Amazon RDS. I need to restore data into one of the tables in that DB from a dump (.backup) located on my PC.

Here is the table structure:

CREATE TABLE condos
(
  name text NOT NULL,
  latitude numeric,
  longitude numeric,
  condo_id serial NOT NULL,
  city text NOT NULL,
  country text NOT NULL,
  address text,
  district text,
  CONSTRAINT pk_condo_id PRIMARY KEY (condo_id),
  CONSTRAINT uniq_title UNIQUE (name)
)

The problem: when I try to restore it from the local DB, I get a UNIQUE constraint violation error:

pg_restore: [archiver (db)] COPY failed for table "condos": ERROR:  duplicate key value violates unique constraint "pk_condo_id"
DETAIL:  Key (condo_id)=(21) already exists.

Also, other tables have relations with the condos table, and I don't want to update the whole DB.

I know this won't be the only time I do this procedure, so my question is: what is the best and simplest way to update the table, inserting only the data that does not already exist in the production DB's table?

user1935987
  • Restore data into the another schema, then update your production data from it using SQLs (just one solution from the many possible). – Abelisto Jun 20 '16 at 14:29
  • ' best & simple way to update the table' – user1935987 Jun 20 '16 at 15:50
  • I suspect your `SERIAL` column is causing complications. Can you change it to an integer type for the import, and then reattach it to the sequence afterwards? – jmelesky Jun 20 '16 at 16:01
  • is there any reason to change it to another type except for this merging issue? I mean, from what I see, `SERIAL` is usually used for indexing – user1935987 Jun 20 '16 at 16:23
  • `SERIAL` isn't a real type, it's a shortcut for integer defaulting to using a sequence. [The documentation on it is here](https://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-SERIAL). Since they are equivalent, you should be able re-add the `DEFAULT nextval(...)` after your import ([though you'll likely want to use `setval` afterwards](https://www.postgresql.org/docs/current/static/functions-sequence.html)). – jmelesky Jun 20 '16 at 17:50
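jmelesky's suggestion would look roughly like this (a sketch; `condos_condo_id_seq` is the sequence name PostgreSQL generates by default for this serial column — verify yours with `\d condos`):

```sql
-- Detach the column from the sequence for the duration of the import
ALTER TABLE condos ALTER COLUMN condo_id DROP DEFAULT;

-- ... run the import here ...

-- Reattach the default, then move the sequence past the imported ids
ALTER TABLE condos ALTER COLUMN condo_id
  SET DEFAULT nextval('condos_condo_id_seq');
SELECT setval('condos_condo_id_seq', (SELECT max(condo_id) FROM condos));
```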

1 Answer


Simple and best may not always be the same.

Insert via a temporary table

Create another table similar to the one you have; let's name it condos_tmp. Import your data into this table. Assuming your dump has no duplicate entries, the import will succeed. Then run:

DELETE FROM condos_tmp WHERE condo_id IN (SELECT condo_id FROM condos);

Note that this query can take a very long time if you have millions of records. Next comes a simple insert-select:

INSERT INTO condos SELECT * FROM condos_tmp;
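Put together, the whole flow looks roughly like this (a sketch: the comment about loading the data and the default sequence name are assumptions — adapt them to your dump format and schema):

```sql
-- Staging table with the same columns, defaults and constraints as condos
CREATE TABLE condos_tmp (LIKE condos INCLUDING ALL);

-- Load the dump's rows into condos_tmp here, e.g. via COPY from an
-- exported CSV, or a data-only restore retargeted at the staging table.

-- Drop rows whose keys already exist in production
DELETE FROM condos_tmp WHERE condo_id IN (SELECT condo_id FROM condos);

-- Column order matches condos, so SELECT * is safe here
INSERT INTO condos SELECT * FROM condos_tmp;

-- Keep the serial's sequence ahead of the explicitly imported ids
-- (condos_condo_id_seq is the default sequence name; verify with \d condos)
SELECT setval('condos_condo_id_seq', (SELECT max(condo_id) FROM condos));

DROP TABLE condos_tmp;
```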

By dropping the keys and then deleting duplicates

This solution can be used even if your dump has duplicate data.

ALTER TABLE condos DROP CONSTRAINT pk_condo_id;
ALTER TABLE condos DROP CONSTRAINT uniq_title;

Now import your data, then hunt for duplicates. See https://wiki.postgresql.org/wiki/Deleting_duplicates and Delete completely duplicate rows in PostgreSQL (and keep only 1); which copy to delete is up to you. Those links explain how to write the query for each situation.
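For example, one common pattern from those links keeps an arbitrary survivor per duplicated key by comparing `ctid` (a sketch; decide yourself which copy should win before running it on production):

```sql
-- Remove rows that duplicate another row's condo_id or name,
-- keeping the physically-first copy of each
DELETE FROM condos a
USING condos b
WHERE a.ctid > b.ctid
  AND (a.condo_id = b.condo_id OR a.name = b.name);
```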

Then add back the constraints.

ALTER TABLE condos ADD CONSTRAINT pk_condo_id PRIMARY KEY(condo_id);
ALTER TABLE condos ADD CONSTRAINT uniq_title UNIQUE(name);
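With this approach too, after bulk-inserting rows that carry explicit condo_id values you will likely want to bump the serial's sequence so future inserts don't collide (the sequence name here is the PostgreSQL default for this column — check yours with `\d condos`):

```sql
-- Advance the sequence past the highest imported id
SELECT setval('condos_condo_id_seq', (SELECT max(condo_id) FROM condos));
```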
e4c5