14

I have a requirement of merging two databases (Database1 & Database2) to a common database in Postgresql.

Database1

Table1

Id - Value (Id is Primary key)

1 - a
2 - aa
3 - aaa
4 - aaaa

Database2

Table1

Id Value (Id is Primary key)

2 - bb
5 - bbbbb

I want my output as

OutPutDatabase

Table1

Id Value (Id is Primary key)

1 - a
2 - bb
3 - aaa
4 - aaaa
5 - bbbbb

How can I achieve this?

Girish
  • 389
  • 2
  • 4
  • 12
  • How are you deciding what the id should be in the merged database - does it matter? Are there foreign keys referencing these tables, and if so what do you want to do then? – Richard Huxton Feb 29 '12 at 13:12

4 Answers4

8

First, load the tables into two separate schemas in the same database.

CREATE SCHEMA db1;
CREATE TABLE db1.table1 (id INT PRIMARY KEY, value VARCHAR(10));
-- load the contents of Table1 @ Database1 into db1.table1

CREATE SCHEMA db2;
CREATE TABLE db2.table1 (id INT PRIMARY KEY, value VARCHAR(10));
-- load the contents of Table1 @ Database2 into db2.table1

Then you can merge the two, by prioritizing db2 over db1.

SELECT
  DISTINCT ON (id)
  id,
  value
FROM (
  SELECT
    *,
    1 AS db
  FROM
    db1.table1

  UNION

  SELECT
    *,
    2 AS db
  FROM
    db2.table1) AS mix
ORDER BY
  id,
  db DESC;
Kouber Saparev
  • 7,637
  • 2
  • 29
  • 26
5

Not my idea, but one I have read in the past.

The source: Move table to new database

pg_dump -d db1 -t table1 |psql db2

then psql and do

insert into table2 (select * from table1);
Kuberchaun
  • 29,160
  • 7
  • 51
  • 59
2

I used the migrate_pkey_sequence() function defined here to migrate the primary keys of multiple tables (with the same schema but different values from different databases) to distinct ranges so that I could merge the tables.

This is the approximate sequence I followed to merge two databases, d1 and d2, each of which has a table example:

  1. Dump d1 and d2 to the respective files d1.sql.gz and d2.sql.gz. This is the command I used to dump:

    $ pg_dump --no-owner | gzip > d1.sql.gz
    
  2. Create a new, empty database on a local PostgreSQL server.

  3. Load d1.sql.gz.
  4. Migrate d1.example using:

    # SELECT migrate_pkey_sequence('example', 'id', 'example_id_seq', 1);
    INFO:  00000: UPDATE example SET id = nextval('example_id_seq') + 0
    INFO:  00000: ALTER SEQUENCE example_id_seq RESTART WITH 1
    INFO:  00000: UPDATE example SET id = DEFAULT
     migrate_pkey_sequence
    -----------------------
                         4
    (1 row)
    
  5. Record the value printed (4 in this example). That will be the start of the next sequence.

  6. Dump to the file d1-new.sql.gz in the same way as above.
  7. Repeat steps 2 through 4 with d2.sql.gz but use the value from step 5 as the argument to migrate_pkey_sequence():

    # SELECT migrate_pkey_sequence('example', 'id', 'example_id_seq', 4);
    
  8. Dump to the file d2-new-without-schema.sql.gz without saving the schema and disable triggers. This is the command I used:

    $ pg_dump --no-owner --data-only --disable-triggers | \
      gzip > d2-new-without-schema.sql.gz
    
  9. Create a new, empty database.

  10. Load d1-new.sql.gz and d2-new-without-schema.sql.gz in order. If everything worked as expected, the primary keys should be disjoint and not conflict.

  11. Dump the database if you want to load it into a remote server.

Community
  • 1
  • 1
Sean Leather
  • 1,182
  • 1
  • 9
  • 25
  • As an added note, for migrate_pkey_sequence to work properly with foreign keys also make sure you have the keys set as ON UPDATE CASCADE – Sabrina Leggett Sep 07 '22 at 22:14
0

I have been able to easily merge 2 postgres database using pg_dump & pg_restore.

In my case, it was easy as I use UUID identifiers, so there are no duplicated keys.

Export (data-only, excluding some schemas and tables):

pg_dump --dbname=database_1 --username=${POSTGRES_USER} --no-owner --format=tar --data-only --exclude-schema=hdb_views --exclude-schema=hdb_catalog --exclude-table=app_configuration --exclude-table=migrations --file=/database_1.data-only.dump

Import (changing role if necessary):

pg_restore --username=$POSTGRES_USER --role=$POSTGRES_USER --no-owner --exit-on-error --verbose --dbname=database_2 /database_1.data-only.dump

Nico Toub
  • 1,528
  • 15
  • 17