
I am writing a pandas DataFrame to a Redshift database using pandas.to_sql(), like so:

from sqlalchemy import create_engine
import pandas as pd

# Redshift speaks the PostgreSQL protocol, so the postgresql:// dialect works
conn = create_engine('postgresql://user:password@redshift-url.amazonaws.com:5439/db')

dataframe.to_sql('table_name', conn, index=False, if_exists='replace', schema='schema_name')

In this DataFrame I have a timestamp column which looks like 2020-03-02, but when I write it to the database it gets stored as a varchar, and I can't use this column to plot graphs which require a timestamp format.

Is there a way to define the column type when writing data to a database with pandas.to_sql()? I do not want to create a view or another table out of the one I am writing, because then I would need to drop the current table, and since the new object would depend on the original one I would get the error "cannot drop table because other objects depend on it".

Jonas Palačionis
You could use `sqlalchemy.types` and specify a schema dictionary as `dtype` to the `pd.to_sql` function; check the accepted answer in this link - [pandas to_sql all columns as nvarchar](https://stackoverflow.com/questions/34383000/pandas-to-sql-all-columns-as-nvarchar). Check [here](https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/types.py) for supported sqlalchemy types. – tidakdiinginkan May 06 '20 at 09:05
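
A minimal sketch of that suggestion, assuming the DataFrame and engine from the question, a timestamp column named 'date', and an illustrative VARCHAR length:

from sqlalchemy import types

# hypothetical mapping: every column becomes VARCHAR, then the date column is overridden
sql_types = {col: types.VARCHAR(256) for col in dataframe.columns}
sql_types['date'] = types.TIMESTAMP()

dataframe.to_sql('table_name', conn, index=False, if_exists='replace',
                 schema='schema_name', dtype=sql_types)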

1 Answer


Use `sqlalchemy.types`. For your timestamp column:

from sqlalchemy import types

# map the timestamp column's name to the desired SQL type
sql_types = {'date': types.TIMESTAMP()}

Then change your dataframe.to_sql call to include the dtype argument:

dataframe.to_sql('table_name', conn, index=False,
                 if_exists='replace', schema='schema_name',
                 dtype=sql_types)
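
Note that dtype only controls the SQL column type. If the values in the DataFrame column are still plain strings, it is also worth converting them to real datetimes before writing (a minimal sketch, assuming the column is named 'date'):

# hypothetical column name; parses strings like '2020-03-02' into datetime64
dataframe['date'] = pd.to_datetime(dataframe['date'])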
Umar.H