
I have a simple dataframe like this

df = pd.DataFrame({"A":[1, 2, 3], "B":["a", "b", "c"]})

I would like to write this dataframe to a Vertica database using the to_sql method. I am using the vertica_python module, and my code is the following:

import pandas as pd
import vertica_python

cxn = {"user":'myuser',
       "password":'mypassword',
       "host":'xx.x.x.xx',
       "port":yyyy,
       "database":"mydb"}

engine = vertica_python.connect(**cxn)

df = pd.DataFrame({"A":[1, 2, 3], "B":["a", "b", "c"]})

df.to_sql("df", index=False, if_exists="replace", con=engine, schema="public", dtype={"A":"int", "B":"int"})

This raises the following database error, which I have not been able to fix:

DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting

Do you have any suggestions for solving this problem? Thank you very much.

1 Answer


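I ran into a similar problem connecting to Vertica. pandas' to_sql only accepts a SQLAlchemy engine/connection or a sqlite3 connection, so a raw vertica_python connection gets treated as SQLite, which is why the failing query references sqlite_master. I got it working by creating the connection with sqlalchemy, specifically through the sqlalchemy-vertica dialect.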

As you seem to be using vertica_python, I would recommend installing with:

pip install sqlalchemy-vertica[vertica-python]

You can then create the engine like this:

import sqlalchemy as sa
import vertica_python

engine = sa.create_engine('vertica+vertica_python://user:pwd@host:port/database')
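To make that concrete, here is a minimal sketch of how the URL could be built from the same connection values as in the question, and how the resulting engine would then be passed to to_sql. The helper names vertica_url and write_frame are my own (hypothetical), as are the example host, port 5433 and password; the password is percent-encoded so that characters like @ or / do not break the URL. I have only sketched write_frame, so treat it as untested against a real Vertica server.

```python
from urllib.parse import quote


def vertica_url(user, password, host, port, database):
    """Build a SQLAlchemy URL for the vertica+vertica_python dialect.

    Hypothetical helper: user and password are percent-encoded so that
    special characters (e.g. '@' or '/') do not corrupt the URL.
    """
    return (
        "vertica+vertica_python://"
        f"{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


def write_frame(df, table, url, schema="public"):
    """Write a DataFrame through a SQLAlchemy engine (hypothetical helper).

    sqlalchemy is imported lazily so vertica_url works without it installed.
    Requires sqlalchemy-vertica and vertica-python at call time.
    """
    import sqlalchemy as sa

    engine = sa.create_engine(url)
    # Pass the SQLAlchemy engine, not a raw vertica_python connection.
    df.to_sql(table, con=engine, schema=schema, index=False, if_exists="replace")


# Example values are placeholders, not real credentials.
url = vertica_url("myuser", "p@ss/word", "localhost", 5433, "mydb")
print(url)
```

With a reachable database, write_frame(df, "df", url) would replace the df.to_sql call from the question. I left out the dtype argument on purpose: the question maps column B to "int" although it holds strings, and with an engine pandas expects SQLAlchemy types there rather than type-name strings.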

Note that according to this thread, uploads with pandas and sqlalchemy can become a lot faster by passing the following option to create_engine:

engine = sa.create_engine(sqlalchemy_url, fast_executemany=True)

I haven't tried it yet, but it looks promising. More on it in that answer.

EDIT

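I tried the fast_executemany flag above and unfortunately it does not work with Vertica. The option belongs to SQLAlchemy's pyodbc-based SQL Server dialect, not to the vertica_python dialect.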

realr