0

I get the Weather forecast for my City with an API and write this data to a mysql database. For this i use panda and a dataframe. This Dataframe has 40 entries: 8 a day for the next 5 days. Every 3 hours, there is a new entry generated from the website. My program repeats itself every minute.

    #get data
    data = urllib.request.urlopen("http://api.openweathermap.org/data/2.5/forecast?id=2867714&APPID=fddaaa343ef234b26418093023a6d2a8")
    data = json.loads(data.read())
    df=json_normalize(data['list'])
    df['city']=data['city']['name']
    
    #select data
    df = df[['dt_txt','main.temp','main.pressure','main.humidity','city']]

    #write to mysql database
    engine = create_engine('mysql+mysqlconnector://root:XXXXXXXX@localhost/munich_temp')
    df.to_sql(name='sheet', con=engine, if_exists = 'replace', index=False)

This code replaces the table every time.

Is there a way to check if the last timestamp of my dataframe already exists? If it doesn't exist i want to add one entry to the mysql table and overwrite the previous 39 entries. In the end it should generate an infinite database with Weather forecasts and no duplicates.

SecretAgentMan
  • 2,856
  • 7
  • 21
  • 41
Flaayor
  • 47
  • 4

1 Answers1

0

The first thing to mention is the function call df.to_sql(name='sheet', con=engine, if_exists = 'replace', index=False). With if_exists set to replace your complete table will be dropped and re-created every time you call this function. to_sql - Documentation

Your program runs every minute so every minute you recreate your table. You can simply set if_exists to append and the complete dataframe will be added to your table. If you do so you just need to remove duplicates from your table. Not a pretty solution but simple. remove duplicates from table

yabberth
  • 507
  • 4
  • 9