
I’m inserting a large CSV file into a PostgreSQL database using `bulk_insert_mappings`, which I found to be the fastest solution. But I still need to skip rows that already exist in the database: when I inserted rows with duplicate PKs, no error was raised and the duplicate PKs ended up in the table. Is there a way to efficiently check whether some rows already exist?
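
For context, the load path looks roughly like this (a minimal sketch; the model, column names, file name and connection string are assumptions for illustration, not taken from the question):

```python
import csv
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Record(Base):                      # assumed target table
    __tablename__ = "records"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("postgresql+psycopg2://user:pass@localhost/mydb")
Session = sessionmaker(bind=engine)
session = Session()

with open("data.csv", newline="") as f:
    rows = [{"id": int(r["id"]), "name": r["name"]} for r in csv.DictReader(f)]

# Fast bulk load, but it does nothing special about rows whose PK already exists.
session.bulk_insert_mappings(Record, rows)
session.commit()
session.close()
```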

  • https://www.postgresqltutorial.com/postgresql-upsert/ maybe – Dima Tisnek Jul 14 '20 at 00:53
  • You might want to check out `COPY` (or `copy_from` from psycopg2); it is usually faster. When you say "it inserted duplicate PKs", are you sure the PK column actually has the `PRIMARY KEY` constraint? Duplicates in a primary key column are not possible (short of a bug in Postgres, which seems unlikely). Lastly, as others have said, look up the `INSERT ... ON CONFLICT` (i.e. "upsert") syntax. Loading the new data into a temporary table and then upserting it into the existing one should be the easiest/fastest way to do what you want; see the sketch after these comments. – Marth Jul 14 '20 at 01:19
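
A minimal sketch of the temp-table + upsert approach suggested in the comments (table/column names, file name, and connection details are assumptions; it uses psycopg2’s `copy_expert` rather than `copy_from` so that `COPY` itself can handle the CSV header):

```python
import psycopg2

conn = psycopg2.connect("dbname=mydb user=me password=secret host=localhost")
with conn, conn.cursor() as cur:
    # Staging table with the same columns as the target table; dropped at commit.
    cur.execute(
        "CREATE TEMP TABLE staging (LIKE records INCLUDING DEFAULTS) ON COMMIT DROP"
    )

    # COPY the whole CSV into the staging table in one round trip
    # (assumes the file has a header row).
    with open("data.csv") as f:
        cur.copy_expert(
            "COPY staging (id, name) FROM STDIN WITH (FORMAT csv, HEADER true)", f
        )

    # Move rows into the real table, silently skipping PKs that already exist.
    cur.execute(
        """
        INSERT INTO records (id, name)
        SELECT id, name FROM staging
        ON CONFLICT (id) DO NOTHING
        """
    )
conn.close()
```

Replacing `DO NOTHING` with `DO UPDATE SET ...` turns the same statement into a true upsert if existing rows should be refreshed instead of skipped.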

0 Answers