0

I have an SQL DB (Postgree) with ~ 2000000 records in a table. And I have a CSV file with ~ 500000 rows. I need to check all records in CSV and add them to DB if they were not present before . I should check them by name field.

My question is what is the fastest way to do that?

Thanks

GhostKU
  • 1,898
  • 6
  • 23
  • 32
  • https://stackoverflow.com/questions/48019381/how-postgresql-copy-to-stdin-with-csv-do-on-conflic-do-update could be helpful. – Ilja Everilä Jul 13 '18 at 12:58

1 Answers1

0
  1. Load your new rows into a staging table.
  2. Index your staging and target tables on the name column
  3. write a query like this:

    INSERT INTO target (column list) SELECT column list FROM source LEFT JOIN target ON target.name = source.name WHERE target.name IS NULL

or

INSERT INTO target (<column list>)
SELECT <column list> FROM source
WHERE NOT EXISTS (
    SELECT * from target 
    WHERE target.name = source.name
)
user1443098
  • 6,487
  • 5
  • 38
  • 67