
I'm manually updating a table in my database using the import functionality in PostgreSQL. Working with a large amount of data, I ran into an issue with duplicate primary keys. I am looking for a script that uploads only the values that do not violate the primary key constraint; the rows that do violate it should be ignored (neither inserted nor updated).

I have already seen code that would roughly do what I need, but I'm not quite sure it will work for me.

The columns I am working with are: acc_ph_id (primary key), acc_ph_no, ph_type, ph_flag.

Any suggestions would be highly appreciated, as I am rather new to PostgreSQL in general.

user2959843

1 Answer


Upload the data into a staging table with no constraints.
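For example, a minimal sketch (the column types and the file path are assumptions; adjust them to match your data and use whatever import mechanism you already use):

    -- the staging table mirrors the target columns but has no primary key,
    -- so rows with duplicate acc_ph_id values load without error
    create table staging_table (
        acc_ph_id  integer,  -- assumed type
        acc_ph_no  text,     -- assumed type
        ph_type    text,     -- assumed type
        ph_flag    text      -- assumed type
    );

    -- hypothetical file path
    copy staging_table (acc_ph_id, acc_ph_no, ph_type, ph_flag)
    from '/path/to/data.csv' with (format csv, header);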

Then load the staging data into the real table, eliminating duplicates:

insert into real_table(acc_ph_id, acc_ph_no, ph_type, ph_flag)
    -- distinct on (acc_ph_id) keeps exactly one row per key, collapsing
    -- duplicates within the staging data; order by picks which row wins
    select distinct on (acc_ph_id) acc_ph_id, acc_ph_no, ph_type, ph_flag
    from staging_table
    order by acc_ph_id;

EDIT:

Oh, if the problem is keys that already exist in the real table, then do:

insert into real_table(acc_ph_id, acc_ph_no, ph_type, ph_flag)
    select distinct on (acc_ph_id) acc_ph_id, acc_ph_no, ph_type, ph_flag
    from staging_table st
    -- skip any row whose key is already present in the target table
    where not exists (select 1
                      from real_table rt
                      where rt.acc_ph_id = st.acc_ph_id
                     )
    order by acc_ph_id;
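(Side note for readers on PostgreSQL 9.5 or newer, which postdates this answer: `ON CONFLICT DO NOTHING` handles both duplicates within the staging data and keys already present in the target in a single statement. A minimal sketch:)

    insert into real_table(acc_ph_id, acc_ph_no, ph_type, ph_flag)
        select acc_ph_id, acc_ph_no, ph_type, ph_flag
        from staging_table
        on conflict (acc_ph_id) do nothing;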
Gordon Linoff
  • Thanks Gordon, will this work in the case where there are no duplicate primary keys within staging_table, but there are duplicates between staging_table and real_table? Thanks – user2959843 Dec 28 '13 at 16:48
  • That is correct: there are primary keys in "real_table" that are also present in the "staging_table". Thanks a lot for your prompt response. I'll try your solution and will update this post. – user2959843 Dec 28 '13 at 16:55
  • Strongly recommend a `LOCK TABLE` in the transaction doing this work. – Craig Ringer Dec 29 '13 at 03:19
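For example, a minimal sketch of that (the lock mode here is an assumption, chosen to block concurrent writers while still allowing reads):

    begin;

    -- block concurrent inserts into real_table so no other session can add
    -- a conflicting key between the not-exists check and the insert
    lock table real_table in share row exclusive mode;

    insert into real_table(acc_ph_id, acc_ph_no, ph_type, ph_flag)
        select distinct on (acc_ph_id) acc_ph_id, acc_ph_no, ph_type, ph_flag
        from staging_table st
        where not exists (select 1
                          from real_table rt
                          where rt.acc_ph_id = st.acc_ph_id
                         )
        order by acc_ph_id;

    commit;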