
I am trying to export my dataframe to a SQL database (Postgres).

I created the table as follows:

CREATE TABLE dataops.OUTPUT
(
    ID_TAIL CHAR(30) NOT NULL,
    ID_MODEL CHAR(30) NOT NULL,
    ID_FIN CHAR(30) NOT NULL,
    ID_GROUP_FIN CHAR(30) NOT NULL,
    ID_COMPONENT CHAR(30) NOT NULL,
    DT_OPERATION TIMESTAMP NOT NULL,
    DT_EXECUTION TIMESTAMP NOT NULL,
    FT_VALUE_SENSOR FLOAT NOT NULL,
    DT_LOAD TIMESTAMP NOT NULL
);

And I want to write this dataframe into that sql table:

import sqlalchemy
import pandas as pd

conn = sqlalchemy.create_engine("postgresql://root:1234@localhost:5432/postgres")
data = [['ID_1', 'A4_DOOUE_ADM001', '1201MJ52', 'PATH_1', 'LATCHED1AFT',
         '2016-06-22 19:10:25', '2020-11-12 17:20:33.616016', 2.9, '2020-11-12 17:54:06.340735']]

output_df = pd.DataFrame(data, columns=["id_tail", "id_model", "id_fin", "id_group_fin", "id_component",
                                        "dt_operation", "dt_execution", "ft_value_sensor", "dt_load"])

But when I run output_df.to_sql to write into the database, I see that a new table "OUTPUT", with double quotes, has been created and the data inserted there.

output_df.to_sql(cfg.table_names["output_rep27"], conn, schema=cfg.db_parameters["schema"], if_exists='append', index=False)

This is what I see in my database (screenshot omitted).

But the same table without quotes is empty (screenshot omitted).

When you deliberately make the insert fail (by changing a column name, for example), the error message shows that pandas is inserting with double quotes (screenshot omitted).

How can I stop pandas from inserting with a double-quoted table name?

Henry Navarro
  • Force the table name to lower case `cfg.table_names["output_rep27"].lower()`. That would not turn off the double quoting, but would result in all lower case table names. – Adrian Klaver Nov 12 '20 at 17:10
  • Maybe you misunderstood; the problem is not uppercase vs. lowercase, the problem is that it writes with double quotes. I have tried to modify the source code but have failed so far. – Henry Navarro Nov 12 '20 at 17:15

6 Answers

Score: 0

Short version: pandas double quotes identifiers, which is fairly standard. When that happens with an upper case identifier, you have to double quote it from then on when using it. Used unquoted, the name is folded to lower case and you won't find the table. For more information, see Identifier Syntax in the PostgreSQL docs. You have three choices: do as I suggested in the comment and force the name to lower case, always double quote the identifier when using it, or modify the pandas source code to not double quote.
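The folding rule that makes the two tables diverge can be sketched with a tiny pure-Python model (the function name is mine, and this only illustrates the rule, it is not PostgreSQL's actual implementation):

```python
def pg_fold(identifier: str, quoted: bool) -> str:
    # Illustrative model of PostgreSQL's rule: an unquoted identifier
    # is folded to lower case, a quoted one keeps its exact case.
    return identifier if quoted else identifier.lower()

# CREATE TABLE dataops.OUTPUT (unquoted) actually creates "output",
# while pandas' quoted INSERT targets a distinct table "OUTPUT":
pg_fold("OUTPUT", quoted=False)  # -> 'output'
pg_fold("OUTPUT", quoted=True)   # -> 'OUTPUT'
```

That is why the quoted table "OUTPUT" holds the data while the unquoted table (folded to output) stays empty.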

Adrian Klaver
  • Do you know where the double quotes are added? – Henry Navarro Nov 12 '20 at 17:31
  • No, and I would argue strongly against modifying the source, for a couple of reasons: 1) you will forever have to keep doing that; 2) it will probably create a new issue, namely that when you do `CREATE TABLE SOME_UPPER_CASE_NAME` the name will become `some_upper_case_name`, since without the quoting it gets case folded down. Read the link I posted in my answer; it should make things clearer. – Adrian Klaver Nov 12 '20 at 17:38
  • The problem is not in pandas but in sqlalchemy; that's the code I modified, and it worked. – Henry Navarro Nov 12 '20 at 17:49
  • Except now you won't have an upper/mixed case table name but an all lower case one. And you have lost the ability to have upper/mixed case names. – Adrian Klaver Nov 13 '20 at 15:24
Score: 0

I found the same question elsewhere, and here is the accepted answer for it:

We need to lower-case the dataframe columns before sending them to PostgreSQL, and use a lower-cased table name, so we don't need double quotes when selecting the table or its columns.

EDIT: I found out that whitespace also forces pandas' to_sql to write the table or column name with double quotes in PostgreSQL. So if you want the names to be double-quote free, replace the whitespace with non-whitespace characters or delete it from the table and column names.

This is the example from my own case:

import pandas as pd
import re
from sqlalchemy import create_engine

df = pd.read_excel('data.xlsx')
ws = re.compile(r"\s+")

# lower the case, strip leading and trailing white space,
# and substitute the whitespace between words with underscore
df.columns = [ws.sub("_", i.lower().strip()) for i in df.columns]

my_db_name = 'postgresql://postgres:my_password@localhost:5432/db_name' 
engine = create_engine(my_db_name) 
df.to_sql('lowercase_table_name', engine) #use lower cased table name
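The column clean-up above reduces to a small string helper (the function name is mine) that can be sanity-checked without touching a database:

```python
import re

_WS = re.compile(r"\s+")

def clean_column(name: str) -> str:
    # Lower-case, strip leading/trailing space, and replace inner
    # whitespace runs with underscores, so pandas has no reason
    # to quote the identifier.
    return _WS.sub("_", name.lower().strip())

clean_column("  Sensor Value ")  # -> 'sensor_value'
clean_column("DT_LOAD")          # -> 'dt_load'
```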
  • No, it should work even with uppercase, so that's not the problem. – Henry Navarro Aug 31 '21 at 10:37
  • @Henry Navarro I don't really get the point of your comment (maybe you can explain it so I can respond properly), but **this solution will make the resulting table name and column names clean (selectable without double quotes) and case insensitive**, so you can select the resulting table and columns with lower, upper or mixed letter case, as long as the letters match the column or table name. – firli_drako Sep 01 '21 at 13:20
  • Basically the solution is to just set everything to lowercase? – Qohelet Nov 26 '21 at 17:32
Score: 0

These lines of code worked for me:

appended_data.columns = map(str.lower, appended_data.columns)
appended_data.to_sql('table_name', con=engine,
                     schema='public', index=False, if_exists='append', method='multi')
Score: 0

You need to use uppercase letters in pandas in order to get names without quotes in the SQL table. Use this code on your df:

df.columns = df.columns.str.upper()
Score: -1

I didn't find a "good" solution, so I created my own function to insert the values:

import sqlalchemy
import pandas as pd

conn = sqlalchemy.create_engine("postgresql://root:1234@localhost:5432/postgres")
data = [['ID_1', 'A4_DOOUE_ADM001', '1201MJ52', 'PATH_1', 'LATCHED1AFT',
         '2016-06-22 19:10:25', '2020-11-12 17:20:33.616016', 2.9, '2020-11-12 17:54:06.340735']]

output_df = pd.DataFrame(data, columns=["id_tail", "id_model", "id_fin", "id_group_fin", "id_component",
                                        "dt_operation", "dt_execution", "ft_value_sensor", "dt_load"])

def to_sql(output_df, table_name, conn, schema):
    # Build the INSERT with unquoted identifiers, so PostgreSQL folds
    # the table and column names to lower case.
    my_query = ('INSERT INTO ' + schema + '.' + table_name +
                ' (' + ', '.join(output_df.columns) + ')'
                ' VALUES (' + ', '.join(['%s'] * output_df.shape[1]) + ');')
    record_to_insert = output_df.applymap(str).values.tolist()
    conn.execute(my_query, record_to_insert)

to_sql(output_df, 'output', conn, 'dataops')

I hope it is useful to somebody.

Henry Navarro
  • Hi, I tried to run your function instead of .to_sql(), but it throws an error: LINE 1: INSERT INTO public.test_database (kind,collectionName,trackN.... However, I am running without declaring a schema. – PatrickHellman Dec 15 '21 at 12:50
  • Were you able to find any working solutions? – PatrickHellman Dec 15 '21 at 13:01
  • @PatrickHellman What do you mean by "working solutions"? I cannot see your database, so I don't know exactly what you mean. In my case this function worked. – Henry Navarro Dec 16 '21 at 13:03
Score: -1

For those who are still looking for the answer:

Instead of writing

output_df.to_sql(name='some_schema.some_table', con=conn)

you should pass the schema via the corresponding to_sql() parameter:

output_df.to_sql(name='some_table', schema='some_schema', con=conn)

Otherwise 'some_schema.some_table' will be treated as a single table name and quoted as a whole.
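The quoting behavior can be mimicked with a rough pure-Python simulation (the real logic lives in SQLAlchemy's identifier preparer; this sketch only approximates the rule and ignores reserved words):

```python
import re

def render_identifier(identifier: str) -> str:
    # Approximation of the rule: emit unquoted only if the name is all
    # lower-case letters, digits and underscores; otherwise wrap it
    # in double quotes.
    if re.fullmatch(r"[a-z_][a-z0-9_]*", identifier):
        return identifier
    return '"' + identifier + '"'

render_identifier("some_table")              # -> 'some_table'
render_identifier("some_schema.some_table")  # -> '"some_schema.some_table"'
```

The dot is not a legal character inside an unquoted identifier, so the whole dotted string gets quoted as one name instead of being split into schema and table.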