
When uploading a batch of CSVs to my Flask application, I'd like to be able to bulk insert them into my SQLAlchemy table.

| status    | medium | landing page    |
|-----------|--------|-----------------|
| converted | google | www.example.com |
| ...       | ...    | ...             |

Pandas has a `to_sql` method which can bulk insert the CSVs into the database; however, the only duplicate check it performs is whether the table already exists in the database. I need to check whether the individual entries are already in the database and only upload new entries.

Currently, I know I can solve this by iterating through the pandas DataFrame, but since iterating over a DataFrame is generally discouraged, I'm wondering whether there is a more efficient way to solve this problem. Any suggestions?
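
Roughly, the current upload is something like the following (the connection string, file name, and table name are placeholders):

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string and names, for illustration only.
engine = create_engine("postgresql://user:pass@localhost/mydb")

df = pd.read_csv("leads.csv")

# if_exists="append" only governs what happens when the *table* already
# exists; it does not skip rows that are already present in it.
df.to_sql("leads", engine, if_exists="append", index=False)
```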

Sean Payne
  • You could use an upsert-like procedure as described [here](https://stackoverflow.com/a/62388768/2144390) and only do the INSERT part. – Gord Thompson Sep 15 '20 at 12:54
  • Like @gordthompson points out in the linked Q/A, put the data in a temporary table, and insert from it selecting rows that do not exist in the destination table, or using `INSERT ... ON CONFLICT DO NOTHING`, a bit like [here](https://stackoverflow.com/questions/61064237/insert-or-update-bulk-data-from-dataframe-csv-to-postgresql-database). – Ilja Everilä Sep 18 '20 at 17:10

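A minimal sketch of the staging-table approach the comments describe, assuming PostgreSQL, SQLAlchemy 1.4+, and a destination table with a unique constraint covering the relevant columns (all table, column, and connection names here are hypothetical):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Hypothetical connection string for illustration only.
engine = create_engine("postgresql://user:pass@localhost/mydb")

df = pd.read_csv("leads.csv")

with engine.begin() as conn:
    # Bulk-load the whole CSV into a temporary staging table in one call.
    df.to_sql("leads_staging", conn, if_exists="replace", index=False)

    # Copy only rows that are not already in the destination table.
    # ON CONFLICT DO NOTHING requires a unique constraint on the
    # destination table (here assumed on status, medium, landing_page).
    conn.execute(text("""
        INSERT INTO leads (status, medium, landing_page)
        SELECT status, medium, landing_page FROM leads_staging
        ON CONFLICT DO NOTHING
    """))
```

Loading the staging table is a single bulk `to_sql` call, and the deduplication happens in one SQL statement on the database side, so there is no need to iterate over the DataFrame in Python.
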
0 Answers