0

I have a MySQL table that is constantly updated every hour. Given an existing MySQL table below:

  date      time    price   
11/6/2019   7:10     4.56  
11/6/2019   7:15     5.01 
11/6/2019   7:20     4.87 
11/6/2019   7:25     4.54 

Two questions:

  1. How can I avoid inserting duplicate rows if there is another entry on 11/6/2019 at 7:25 with the same 4.54 price?

  2. How can I insert the latest price if there is a new insert entry for 11/6/2019 at 7:25 with the latest price of 5.02? (this latest entry, same date/time but different price will replace the existing)

I am new to MySQL and I tried Googling but am unable to find any clear answers - it is making me more confused rather. It seems like in this case, using primary key/unique key is not an option.

I am using Pandas/sqlalchemy/MySQLdb. My code is below

import pandas as pd
import numpy as np
import MySQLdb
from sqlalchemy import create_engine

df = pd.read_csv('summary_1.csv')
price_df = df[['date','time','price']]

engine = create_engine('mysql+mysqldb://root:python@localhost:3306/testdb2', 
echo = False)

df.to_sql(name = 'price', con=engine, if_exists = 'append', index = True)

Anyone can enlighten me? Let me know if you need more information. Thank you!

SunnyBoiz
  • 514
  • 1
  • 5
  • 14
  • The job of a primary key is to guarantee unique rows. – Honeyboy Wilson Jun 16 '19 at 18:40
  • Make Date and time columns combination as your primary key and instead of Insert statement you can use Replace INTO, which will replace the old data with new one, if primary keys will get duplicate values. For more information please visit this https://dev.mysql.com/doc/refman/5.5/en/replace.html – Shubham Rajput Jun 16 '19 at 19:10
  • Use a staging, temp table then run with non-duplicate insert into final table using [NOT IN or NOT EXISTS or LEFT JOIN NULL](https://stackoverflow.com/a/2246793/1422451). – Parfait Jun 16 '19 at 19:58
  • Thanks, I will give these suggestions a try! – SunnyBoiz Jun 17 '19 at 22:32

0 Answers0