I want to keep a database constantly updated with information that I scrape from an API. The data I get back may be incomplete, but it should cover most of what's on the server. So far I have a try/except clause: I try inserting a row into my database, and on the exception I update the existing row instead. Roughly, it looks like the sketch below.

The main problem is that I never delete any rows. I want my copy to match the server's data at any given time, or at least stay close to it, so I need some way to track which rows should be deleted over time, while also making sure it isn't just the scraper handing me incomplete data. I'm using Python and psycopg2.

I assume this is a common problem, but I can't find a better solution than creating a second database, updating it a few times with what I currently have, and then swapping the two databases. Any suggestions? I also don't like that I expect the except clause to be triggered often here.
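For reference, here is a minimal sketch of my current insert-or-update approach. The table and column names (`items`, `id`, `name`) are placeholders, not my real schema:

```python
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder connection string

def upsert_row(row):
    """Try to INSERT; if the row already exists, UPDATE it instead."""
    with conn.cursor() as cur:
        try:
            cur.execute(
                "INSERT INTO items (id, name) VALUES (%(id)s, %(name)s)",
                row,
            )
        except psycopg2.IntegrityError:
            # The failed INSERT aborts the transaction, so roll back
            # before issuing the UPDATE.
            conn.rollback()
            cur.execute(
                "UPDATE items SET name = %(name)s WHERE id = %(id)s",
                row,
            )
        conn.commit()
```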
Thanks in advance!