
I have a dataframe with an index that I want to store in a PostgreSQL database. For this I use `df.to_sql(table_name, engine, if_exists='replace', index=True, chunksize=10000)`.

The index column from the pandas dataframe is copied to the database but is not set as primary key.

There are two solutions that require an additional step:

  1. Specify a schema: `df.to_sql(schema=...)` (see the docs)
  2. Set the primary key after the table is ingested, with the query:

    ALTER TABLE table_name ADD PRIMARY KEY (id_column_name)
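To illustrate the problem, here is a minimal sketch of what `to_sql` does on its own. It uses an in-memory SQLite engine so it runs anywhere; with PostgreSQL the behaviour is the same: the index is copied as a plain column, but no primary-key constraint is created. The table and column names are made up for the example.

```python
import pandas as pd
from sqlalchemy import create_engine, inspect

# In-memory SQLite stands in for a PostgreSQL engine here.
engine = create_engine('sqlite://')

df = pd.DataFrame({'value': [10, 20]}, index=pd.Index([1, 2], name='id'))
df.to_sql('demo', engine, if_exists='replace', index=True)

# Inspect the resulting table: the index became a column, but no
# primary key was set.
pk = inspect(engine).get_pk_constraint('demo')
print(pk['constrained_columns'])  # prints []
```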

Is there a way to set the primary key without specifying the schema or altering the table?

Rutger Hofste

[This question has a couple of suggestions for setting the primary key in a MySQL table, which would likely apply to PostgreSQL as well.](https://stackoverflow.com/questions/30867390/python-pandas-to-sql-how-to-create-a-table-with-a-primary-key) However, it would appear that simply adding a line to use ALTER TABLE would be the fastest and easiest way to set a primary key. – T. Kelly May 22 '18 at 16:23

1 Answer

After calling `to_sql`:

    from sqlalchemy import create_engine, text

    engine = create_engine('postgresql://user:password@host/dbname')  # your connection URL
    with engine.begin() as conn:
        conn.execute(text('ALTER TABLE schema.table ADD PRIMARY KEY (keycolumn);'))

Unfortunately, `pandas.to_sql` doesn't set a primary key; worse, with `if_exists='replace'` it drops and recreates an existing table, destroying any primary key it had. Be mindful of primary keys when using it.
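If you want to avoid `ALTER TABLE` entirely, one common pattern is to declare the table, including its primary key, with SQLAlchemy first, and then let `to_sql` append into it. A minimal sketch, using an in-memory SQLite engine as a stand-in for PostgreSQL and made-up table/column names:

```python
import pandas as pd
from sqlalchemy import (create_engine, inspect,
                        Column, BigInteger, MetaData, Table)

engine = create_engine('sqlite://')  # stand-in for a PostgreSQL engine

# Declare the table with 'id' as primary key, then create it.
metadata = MetaData()
Table('measurements', metadata,
      Column('id', BigInteger, primary_key=True),
      Column('value', BigInteger))
metadata.create_all(engine)

# Append into the pre-created table instead of letting pandas build it.
df = pd.DataFrame({'value': [10, 20]}, index=pd.Index([1, 2], name='id'))
df.to_sql('measurements', engine, if_exists='append', index=True)

pk = inspect(engine).get_pk_constraint('measurements')
print(pk['constrained_columns'])  # prints ['id']
```

The key point is `if_exists='append'`: `'replace'` would drop the declared table, and with it the primary key.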