
I have a dataframe with an index that I want to store in a PostgreSQL database. For this I use `df.to_sql(table_name, engine, if_exists='replace', index=True, chunksize=10000)`.

The index column from the pandas dataframe is copied to the database but is not set as primary key.

There are two solutions that require an additional step:

  1. Specify a schema: `df.to_sql(schema=...)` (see the docs)
  2. Set the primary key after the table is ingested, with the query:

    ALTER TABLE table_name ADD PRIMARY KEY (id_column_name)
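To illustrate the problem, here is a minimal sketch of what `to_sql` does on its own. It uses an in-memory SQLite engine so it runs anywhere; with PostgreSQL the behaviour is the same: the index is copied as a plain column, but no primary-key constraint is created. The table and column names are made up for the example.

```python
import pandas as pd
from sqlalchemy import create_engine, inspect

# In-memory SQLite stands in for a PostgreSQL engine here.
engine = create_engine('sqlite://')

df = pd.DataFrame({'value': [10, 20]}, index=pd.Index([1, 2], name='id'))
df.to_sql('demo', engine, if_exists='replace', index=True)

# Inspect the resulting table: the index became a column, but no
# primary key was set.
pk = inspect(engine).get_pk_constraint('demo')
print(pk['constrained_columns'])  # prints []
```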

Is there a way to set the primary key without specifying the schema or altering the table?

Rutger Hofste

[This question has a couple of suggestions for setting the primary key in a MySQL table, which would likely apply to PostgreSQL as well.](https://stackoverflow.com/questions/30867390/python-pandas-to-sql-how-to-create-a-table-with-a-primary-key) However, it would appear that simply adding a line to use ALTER TABLE would be the fastest and easiest way to set a primary key. – T. Kelly May 22 '18 at 16:23

1 Answer

After calling `to_sql`:

    from sqlalchemy import create_engine, text

    engine = create_engine('postgresql://user:password@host/dbname')  # your connection URL
    with engine.begin() as conn:
        conn.execute(text('ALTER TABLE schema.table ADD PRIMARY KEY (keycolumn);'))

Unfortunately, `pandas.to_sql` doesn't set a primary key; worse, with `if_exists='replace'` it drops and recreates an existing table, destroying any primary key it had. Be mindful of primary keys when using it.
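If you want to avoid `ALTER TABLE` entirely, one common pattern is to declare the table, including its primary key, with SQLAlchemy first, and then let `to_sql` append into it. A minimal sketch, using an in-memory SQLite engine as a stand-in for PostgreSQL and made-up table/column names:

```python
import pandas as pd
from sqlalchemy import (create_engine, inspect,
                        Column, BigInteger, MetaData, Table)

engine = create_engine('sqlite://')  # stand-in for a PostgreSQL engine

# Declare the table with 'id' as primary key, then create it.
metadata = MetaData()
Table('measurements', metadata,
      Column('id', BigInteger, primary_key=True),
      Column('value', BigInteger))
metadata.create_all(engine)

# Append into the pre-created table instead of letting pandas build it.
df = pd.DataFrame({'value': [10, 20]}, index=pd.Index([1, 2], name='id'))
df.to_sql('measurements', engine, if_exists='append', index=True)

pk = inspect(engine).get_pk_constraint('measurements')
print(pk['constrained_columns'])  # prints ['id']
```

The key point is `if_exists='append'`: `'replace'` would drop the declared table, and with it the primary key.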