
I have a pandas DataFrame that I have put into SQL with the code below.

df = pandas.read_csv(io.StringIO(r))
df.to_sql(name='Database count details', con=engine)

Sample data

UNIQUE id   name          refreshed_at    values
2449205     ABC           2014-01-10      22
26019260    DEF           2016-03-04      51
26019261    GHI (1333)    2016-03-04      0.55

My intention is to run a script once a week and, when a unique id matches, replace that row's data in the .db file with that week's data. If there is no match, the new row should be appended to the .db file.

How should I do this? Or is there a better way to perform this task?

jake wong

4 Answers


Just delete the conflicting rows before you run to_sql. Here is how I update results of an experiment.

from sqlalchemy import Table, MetaData

metadata = MetaData(engine)

experiment_name = 'base-20180608'

# Reflect the existing table and delete this experiment's old rows
pipeline_proposals = Table('pipeline_proposals', metadata, autoload=True)
pipeline_proposals.delete().where(pipeline_proposals.c.experiment_name == experiment_name).execute()

# Append the fresh results (the table already exists)
proposals["experiment_name"] = experiment_name
proposals.to_sql("pipeline_proposals", con=engine, if_exists='append')
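
Applied to the question's table, the same delete-then-append pattern might look roughly like this (a sketch in the same pre-1.4 SQLAlchemy style as above; the table name and the 'UNIQUE id' column name are taken from the question's sample data and may need adjusting to your actual schema):

from sqlalchemy import Table, MetaData

metadata = MetaData(engine)

# Reflect the table that to_sql created earlier (name assumed from the question)
counts = Table('Database count details', metadata, autoload=True)

# Delete any rows whose id appears in this week's DataFrame
ids = df['UNIQUE id'].tolist()
counts.delete().where(counts.c['UNIQUE id'].in_(ids)).execute()

# Append this week's rows
df.to_sql('Database count details', con=engine, if_exists='append')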
Piotr Czapla

There doesn't seem to be a built-in feature to do this easily. Currently, I just drop the entire table and recreate it.

from sqlalchemy import MetaData, Table

meta = MetaData()
# Reflect the existing table so it can be dropped
table_to_drop = Table('Database count details',
                      meta, autoload=True, autoload_with=engine)
table_to_drop.drop(engine)
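
To recreate the table afterwards, a minimal sketch is just another to_sql call (assuming df holds the new week's data, as in the question):

# Write this week's DataFrame as a fresh table after the drop
df.to_sql(name='Database count details', con=engine)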
jake wong
import io
import pandas
import sqlalchemy

engine = sqlalchemy.create_engine("sqlite:///" + path)

# Read the new data and set the index
df_new = pandas.read_csv(io.StringIO(r))
df_new = df_new.set_index('indexName', drop=True)

# Read the old data and set the same index
df_old = pandas.read_sql('table', engine)
df_old = df_old.set_index('indexName', drop=True)

# Combine, with the old data first
df = pandas.concat([df_old, df_new], copy=False)

# Drop duplicated index values, keeping the last occurrence (which comes from df_new)
df = df[~df.index.duplicated(keep='last')]

# Replace the old table with the combined one
df.to_sql('table', engine, if_exists="replace")
Brian Vu

df.to_sql(name='Database count details', con=engine, if_exists='replace')

See the if_exists parameter for to_sql at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html

cpep
    Attention here: if_exists refers to TABLES, not to single rows. If the table exists, it recreates the whole table. – Alex Poca Dec 14 '17 at 13:03