
I'm using SQLAlchemy with pandas to query a Postgres database and then insert the results of a transformation into another table in the same database. But when I do df.to_sql('db_table2', engine) I get this error message: ValueError: Table 'db_table2' already exists. It seems pandas tries to create a new table. How do I insert a pandas DataFrame into an already existing table?

df = pd.read_sql_query('select * from "db_table1"',con=engine)
#do transformation then save df to db_table2
df.to_sql('db_table2', engine)

ValueError: Table 'db_table2' already exists
Franck Dernoncourt
DevEx

2 Answers


Make use of the `if_exists` parameter:

df.to_sql('db_table2', engine, if_exists='replace')

or

df.to_sql('db_table2', engine, if_exists='append')

from docstring:

"""
if_exists : {'fail', 'replace', 'append'}, default 'fail'
    - fail: If table exists, do nothing.
    - replace: If table exists, drop it, recreate it, and insert data.
    - append: If table exists, insert data. Create if does not exist.
"""
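The two options can be sketched end-to-end. This is a minimal, self-contained demo that uses an in-memory SQLite engine as a stand-in for your Postgres engine (the `if_exists` behavior is the same); `db_table2` matches the table name from the question:

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite engine as a self-contained stand-in for Postgres.
engine = create_engine("sqlite://")

df = pd.DataFrame({"a": [1, 2, 3]})
df.to_sql("db_table2", engine, index=False)  # first write creates the table

# 'append' inserts into the existing table, keeping the old rows.
df.to_sql("db_table2", engine, if_exists="append", index=False)
appended = pd.read_sql_query("select count(*) as n from db_table2", engine)["n"][0]
print(appended)  # 6 -> original 3 rows plus the appended 3

# 'replace' drops and recreates the table, discarding the old rows.
df.to_sql("db_table2", engine, if_exists="replace", index=False)
replaced = pd.read_sql_query("select count(*) as n from db_table2", engine)["n"][0]
print(replaced)  # 3 -> only the freshly inserted rows remain
```

For the incremental daily-job case in the question, `'append'` is the one you want; `'replace'` throws away whatever was already in the table.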
MaxU - stand with Ukraine
  • it appears it's dropping the table each time. I wish to update the table, not recreate it with a new dataframe – DevEx Jul 27 '16 at 10:48
  • it's an incremental insert, i.e. a daily job that updates the database table – DevEx Jul 27 '16 at 10:53
  • @DevEx, what about using `if_exists='append'`? If it's a complicated case - you would have to make an example – MaxU - stand with Ukraine Jul 27 '16 at 10:55
  • `append` works! But I have a primary key on the table which is auto-incremented (postgres `SERIAL`). I noticed the key resets when I `append`. – DevEx Jul 27 '16 at 11:05
  • @MaxU-stopgenocideofUA `schema` is an important parameter to consider if the default schema is not `dbo` – nikn8 Nov 23 '22 at 09:54
  • What if the datatypes between the existing and the newly generated tables differ from each other? Wouldn't this completely regenerate the table? – Soerendip Mar 25 '23 at 18:00
  • @Soren, Pandas will recreate the table *only* if you specify `if_exists='replace'`. If you specify `append`, then your DB driver will try to cast the data to the target data types - if this is not possible, it should throw an exception – MaxU - stand with Ukraine Mar 26 '23 at 10:37
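On the auto-increment comment above: the usual fix is to leave the key column out of the DataFrame entirely and pass `index=False`, so the database assigns the keys itself. A minimal sketch, again using SQLite as a stand-in (SQLite's `INTEGER PRIMARY KEY` auto-increments much like Postgres `SERIAL`; the table name `daily` is made up for the example):

```python
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")

# Pre-create the table with an auto-incrementing key,
# as you would with SERIAL in Postgres.
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE daily (id INTEGER PRIMARY KEY, val TEXT)"))

# The DataFrame carries only the data columns - no id, no index -
# so each append lets the database continue its own key sequence.
df = pd.DataFrame({"val": ["a", "b"]})
df.to_sql("daily", engine, if_exists="append", index=False)
df.to_sql("daily", engine, if_exists="append", index=False)

ids = pd.read_sql_query("select id from daily order by id", engine)["id"].tolist()
print(ids)  # [1, 2, 3, 4] - the key keeps incrementing across appends
```

If the key appears to "reset", it is usually because the DataFrame's own index (or an id column) is being written and overwriting the database-generated values.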

Zen of Python:

Explicit is better than implicit.

df.to_sql(
    name,             # Name of the SQL table.
    con,              # sqlalchemy.engine.Engine or sqlite3.Connection.
    schema=None,      # Target schema; None uses the database's default schema.
    if_exists='fail', # What to do if the table already exists: 'fail', 'replace' or 'append'.
    index=True,       # Write the DataFrame index as a column; set False to skip it.
    index_label=None, # Column label(s) for the index column(s); only used if index=True.
    chunksize=None,   # Write rows in batches of this size; useful for large DataFrames.
    dtype=None,       # Dict mapping column names to SQL types for the created table.
    method=None,      # Insertion method: None (row-by-row INSERT), 'multi', or a callable.
)

So I normally recommend something like this:

from sqlalchemy.types import String, FLOAT, INT, DATETIME

df.to_sql(
    con=engine,
    name='table_name',
    if_exists='append',
    index=False,
    dtype={
        'Column1': String(255),
        'Column2': FLOAT,
        'Column3': INT,
        'createTime': DATETIME,
    },
)

Set the SQL table's primary key manually (e.g. `Id`) and enable auto-increment on it in Navicat or MySQL Workbench. The `Id` will then increment automatically.

The Docstring of df.to_sql:

Parameters
----------
name : string
    Name of SQL table.
con : sqlalchemy.engine.Engine or sqlite3.Connection
    Using SQLAlchemy makes it possible to use any DB supported by that
    library. Legacy support is provided for sqlite3.Connection objects.
schema : string, optional
    Specify the schema (if database flavor supports this). If None, use
    default schema.
if_exists : {'fail', 'replace', 'append'}, default 'fail'
    How to behave if the table already exists.

    * fail: Raise a ValueError.
    * replace: Drop the table before inserting new values.
    * append: Insert new values to the existing table.

index : bool, default True
    Write DataFrame index as a column. Uses `index_label` as the column
    name in the table.
index_label : string or sequence, default None
    Column label for index column(s). If None is given (default) and
    `index` is True, then the index names are used.
    A sequence should be given if the DataFrame uses MultiIndex.
chunksize : int, optional
    Rows will be written in batches of this size at a time. By default,
    all rows will be written at once.
dtype : dict, optional
    Specifying the datatype for columns. The keys should be the column
    names and the values should be the SQLAlchemy types or strings for
    the sqlite3 legacy mode.
method : {None, 'multi', callable}, default None
    Controls the SQL insertion clause used:

    * None : Uses standard SQL ``INSERT`` clause (one per row).
    * 'multi': Pass multiple values in a single ``INSERT`` clause.
    * callable with signature ``(pd_table, conn, keys, data_iter)``.

    Details and a sample callable implementation can be found in the
    section :ref:`insert method <io.sql.method>`.

    .. versionadded:: 0.24.0
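The callable form of `method` from the docstring can be sketched as follows. This is a minimal example of my own, not the official pandas recipe: `plain_insert` is a made-up name, it assumes a recent pandas plus SQLAlchemy 1.4+ (for `Connection.exec_driver_sql`), and it uses the sqlite3 `?` paramstyle, so it is tied to SQLite; real implementations typically use driver-specific bulk paths such as Postgres `COPY`:

```python
import pandas as pd
from sqlalchemy import create_engine

def plain_insert(pd_table, conn, keys, data_iter):
    # keys: list of column names; data_iter: iterator of row tuples.
    columns = ", ".join(keys)
    placeholders = ", ".join(["?"] * len(keys))  # '?' is the sqlite3 paramstyle
    sql = f"INSERT INTO {pd_table.name} ({columns}) VALUES ({placeholders})"
    conn.exec_driver_sql(sql, list(data_iter))   # executemany over all rows

engine = create_engine("sqlite://")
df = pd.DataFrame({"x": [1, 2], "y": ["a", "b"]})
df.to_sql("t", engine, index=False, method=plain_insert)

out = pd.read_sql_query("select * from t", engine)
print(len(out))  # 2
```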

That's all.

JiangKui