0

I want to keep a database constantly updated with information that I scrape from an API. The data I get may be incomplete but I should have most of it. So far I have a try/except clause where I try inserting a row in my database and on except I update the row. The main problem is that I don't delete any rows. I want to have a copy of the server's data at any given time, or at least stay close to it. I need to somehow keep track of the rows I need to delete over time because I want to make sure it's not just the scraper that's giving me incomplete data. By the way I'm using Python and psycopg2. I'm think that this is a common problem but I can't find a better solution that creating a new database, updating it a couple times with what I currently have and then replace the databases. Any suggestions? I also don't like the fact that I expect the exception clause to get triggered often here....

Thanks in advance!

Max Smith
  • 925
  • 1
  • 14
  • 25

1 Answers1

1

Lacking an upsert (equivalent to MySQL INSERT ... ON DUPLICATE KEY UPDATE) has been a thorn in Postgresql's side for a long time. Generally, your approach is the best way to do it. However, there is an issue in that it's not atomic -- between the time your exception throws and you try and update, the row may have been updated by another process. Often times, this leads people to build immutable rows, but that's another topic.

It appears as of Postgres 9.5 they added an Upsert clause. INSERT ... ON CONFLICT ... DO ... (Documentation here).

In doing other research, it appears there's a far more comprehensive answer here: https://stackoverflow.com/a/17267423/1327710.

Community
  • 1
  • 1
Ned Rockson
  • 1,095
  • 7
  • 16
  • Upsert seems Nice but it doesn't manage deletions? Also I'd like to add a id column to my database. This will be the only unique column so there won't be a conflict to upset? Like the data I get has no ID. I will assign one to it when it is first inserted. But then I guess I need to match rows and somehow throw a conflict to update? Thanks for your help! – Max Smith Jan 11 '17 at 15:36