
I have a pandas DataFrame that I have put into SQL with the code below.

df = pandas.read_csv(io.StringIO(r))
df.to_sql(name='Database count details', con=engine)

Sample data

UNIQUE id   name          refreshed_at    values
2449205     ABC           2014-01-10      22
26019260    DEF           2016-03-04      51
26019261    GHI (1333)    2016-03-04      0.55

My intention is to run a script once a week and, when a unique id matches, replace that row's data in the .db file with that week's data. If there is no match, the new row should be appended to the .db file.

How should I do this? Or is there a better way to perform this task?

jake wong

4 Answers


Just delete the conflicting rows before you run to_sql. Here is how I update results of an experiment.

from sqlalchemy import Table, MetaData

metadata = MetaData(engine)

experiment_name = 'base-20180608'

# Reflect the existing table and delete this experiment's old rows
pipeline_proposals = Table('pipeline_proposals', metadata, autoload=True)
pipeline_proposals.delete().where(pipeline_proposals.c.experiment_name == experiment_name).execute()

# Append the fresh results (the table already exists)
proposals["experiment_name"] = experiment_name
proposals.to_sql("pipeline_proposals", con=engine, if_exists='append')
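
Applied to the question's table, the same delete-then-append pattern might look roughly like this (a sketch in the same pre-1.4 SQLAlchemy style as above; the table name and the 'UNIQUE id' column name are taken from the question's sample data and may need adjusting to your actual schema):

from sqlalchemy import Table, MetaData

metadata = MetaData(engine)

# Reflect the table that to_sql created earlier (name assumed from the question)
counts = Table('Database count details', metadata, autoload=True)

# Delete any rows whose id appears in this week's DataFrame
ids = df['UNIQUE id'].tolist()
counts.delete().where(counts.c['UNIQUE id'].in_(ids)).execute()

# Append this week's rows
df.to_sql('Database count details', con=engine, if_exists='append')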
Piotr Czapla

There doesn't seem to be a built-in feature to do this easily. Currently, I just drop the entire table and recreate it.

from sqlalchemy import MetaData, Table

meta = MetaData()
# Reflect the existing table so it can be dropped
table_to_drop = Table('Database count details',
                      meta, autoload=True, autoload_with=engine)
table_to_drop.drop(engine)
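
To recreate the table afterwards, a minimal sketch is just another to_sql call (assuming df holds the new week's data, as in the question):

# Write this week's DataFrame as a fresh table after the drop
df.to_sql(name='Database count details', con=engine)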
jake wong
import io
import pandas
import sqlalchemy

engine = sqlalchemy.create_engine("sqlite:///" + path)

# Read the new data and set the index
df_new = pandas.read_csv(io.StringIO(r))
df_new = df_new.set_index('indexName', drop=True)

# Read the old data and set the same index
df_old = pandas.read_sql('table', engine)
df_old = df_old.set_index('indexName', drop=True)

# Combine, with the old data first
df = pandas.concat([df_old, df_new], copy=False)

# Drop duplicated index values, keeping the last occurrence (which comes from df_new)
df = df[~df.index.duplicated(keep='last')]

# Replace the old table with the combined one
df.to_sql('table', engine, if_exists="replace")
Brian Vu

df.to_sql(name='Database count details', con=engine, if_exists='replace')

See the if_exists parameter for to_sql at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html

cpep
    Attention here: if_exists refers to TABLES, not to single rows. If the table exists, it recreates the whole table. – Alex Poca Dec 14 '17 at 13:03