I want to add a dataframe to my DB that contains some duplicate rows, but because my real database is very big, I don't want to drop the existing table and recreate it with the updated rows, as that would require pulling everything from the API again.
I think the correct approach could be to read the existing table with pd.read_sql(), compare the result with df2, and then insert only the rows which do not already exist:
import sqlalchemy as db
import pandas as pd
engine = db.create_engine('sqlite:///test.db', echo=True)
data1 = {'Month':['June', 'July', 'August'],
'Number':[20, 21, 19]}
df = pd.DataFrame(data1)
df.to_sql("ExampleTable", engine, if_exists="append", index=False)
data2 = {'Month':['May', 'June', 'July', 'August', 'September'],
'Number':[11, 20, 21, 19, 14]}
df2 = pd.DataFrame(data2)
df2.to_sql("ExampleTable", engine, if_exists="append", index=False)
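One way to do the compare-then-insert step I described above is a left anti-join with pandas' merge(..., indicator=True), assuming a row counts as a duplicate when every column matches. This is only a sketch (it uses an in-memory SQLite database so it runs standalone; substitute 'sqlite:///test.db' for a file-backed one):

```python
import sqlalchemy as db
import pandas as pd

# In-memory DB so the sketch is self-contained; use 'sqlite:///test.db' instead
engine = db.create_engine('sqlite://')

df = pd.DataFrame({'Month': ['June', 'July', 'August'],
                   'Number': [20, 21, 19]})
df.to_sql("ExampleTable", engine, if_exists="append", index=False)

df2 = pd.DataFrame({'Month': ['May', 'June', 'July', 'August', 'September'],
                    'Number': [11, 20, 21, 19, 14]})

# Read what is already stored ...
existing = pd.read_sql("ExampleTable", engine)

# ... then keep only the rows of df2 with no match in the existing table.
# merge() joins on all shared columns; indicator=True adds a '_merge'
# column saying whether each row came from the left side only.
merged = df2.merge(existing.drop_duplicates(), how='left', indicator=True)
new_rows = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')

# Append only the genuinely new rows
new_rows.to_sql("ExampleTable", engine, if_exists="append", index=False)
```

Note that this compares on every column, so a row with the same Month but a different Number would still be inserted; if duplicates should be decided by Month alone, pass on='Month' to merge() instead.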