3

Imagine dropping a subscription and recreating it from scratch. Is it possible to ignore existing data during the first synchronization?

Creating a subscription with (copy_data=false) is not an option because I do want to copy data, I just don't want to copy already existing data.


Example: There is a users table and a corresponding publication on the master. This table has 1 million rows and every minute a new row is added. Then we drop the subscription for a day.

If we recreate the subscription with (copy_data=true), replication will not start due to a conflict with already existing data. If we specify (copy_data=false), 1440 new rows will be missing. How can we synchronize the publisher and the subscriber properly?

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
Max Malysh
  • 29,384
  • 19
  • 111
  • 115

2 Answers2

2

You cannot do that, because PostgreSQL has no way of telling when the data were added.

You'd have to reconcile the tables by hand (or INSERT ... ON CONFLICT DO NOTHING).

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
0

Unfortunately PostgreSQL does not support nice skip options for conflicts yet, but I believe it will be enhanced in the feature.

Based on @Laurenz Albe answer which recommends the use of the statement:

 INSERT ... ON CONFLICT DO NOTHING.

I believe that it would be better to use the following command which also will take care any possible updates on your data before you start the subscription again:

INSERT ... ON CONFLICT UPDATE SET...

Finally I have to say that both are dirty solutions as during the execution of the above statement and the creation of the subscription, new lines may have been arrived which will result in losing them until you perform again the custom sync.

I have seen some other suggested solutions using the LSN number from the Postgresql log file...

For me maybe is elegant and safe to delete all the data from the destination table and create the replication again!

Stavros Koureas
  • 1,126
  • 12
  • 34