2

This is a continuation of this situation: How does Snowflake handle NULL values?

I am trying to insert the dataframe, into a temp table that was created during the session I created with the python connector and cannot insert values into a table where the dataframe is not set yet. How can I add a column of blank NaN and Null values that I can set in the table later?

conn.cursor().execute("create or replace temp table x as")

>>> conn.cursor().execute("USE DATABASE temp_db;")

<snowflake.connector.cursor.SnowflakeCursor object at 0x10c78b048>

>>> conn.cursor().execute("create or replace temp table x(id number, first_name varchar, last_name varchar, email varchar, null_feild boolean, blank_feild varchar, letter_grade varchar(3));")

<snowflake.connector.cursor.SnowflakeCursor object at 0x10acebc88>
>>> df.to_sql('x', con=conn, index=False)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/generic.py:2712: UserWarning: The spaces in these column names will not be changed. In pandas versions < 0.14, spaces were converted to underscores.
  method=method,
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    cur.execute(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/snowflake/connector/cursor.py", line 490, in execute
    query = command % processed_params
TypeError: not all arguments converted during string formatting

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/generic.py", line 2712, in to_sql
    method=method,
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 518, in to_sql
    method=method,
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1749, in to_sql
    table.create()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 641, in create
    if self.exists():
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 628, in exists
    return self.pd_sql.has_table(self.name, self.schema)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1762, in has_table
    return len(self.execute(query, [name]).fetchall()) > 0
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1610, in execute
    raise_with_traceback(ex)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/compat/__init__.py", line 47, in raise_with_traceback
    raise exc.with_traceback(traceback)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    cur.execute(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/snowflake/connector/cursor.py", line 490, in execute
    query = command % processed_params
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting
>>> 

The dataframe is below: 

>>> df['letter_grade'] = np.nan
>>> df.head()
   id first_name last_name  ... null field  blank_ield  letter_grade
0   1      Paule    Tohill  ...      False         NaN           NaN
1   2       Rebe   Slyford  ...       True         NaN           NaN
2   3   Angelita    Antoni  ...      False         NaN           NaN
3   4      Giffy      Dehm  ...      False         NaN           NaN
4   5        Rob    Beadle  ...      False         NaN           NaN

[5 rows x 7 columns]
>>> df.to_sql('x', con=conn, index=False)
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    cur.execute(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/snowflake/connector/cursor.py", line 490, in execute
    query = command % processed_params
TypeError: not all arguments converted during string formatting

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/generic.py", line 2712, in to_sql
    method=method,
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 518, in to_sql
    method=method,
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1749, in to_sql
    table.create()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 641, in create
    if self.exists():
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 628, in exists
    return self.pd_sql.has_table(self.name, self.schema)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1762, in has_table
    return len(self.execute(query, [name]).fetchall()) > 0
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1610, in execute
    raise_with_traceback(ex)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/compat/__init__.py", line 47, in raise_with_traceback
    raise exc.with_traceback(traceback)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    cur.execute(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/snowflake/connector/cursor.py", line 490, in execute
    query = command % processed_params
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting
>>> 

Obviously I don't want this to be a temp database, after the first test scores I will alter the table, I am just not sure why the connection does not like the current dataframe based on the table definition above.

1 Answers1

1

The error thrown here is at a point well before the actual values (such as NaN/None) are evaluated. Before executing the inserts, Pandas runs checks to see if the table exists or if it needs to be created, which is the part that's specifically failing according to the traceback (contains calls to exists, has_table etc.).

To use Panda's to_sql function against a Snowflake DB, ensure you're passing it an actual Snowflake DB SQLAlchemy Engine object and not a generic one.

For connection objects passed to to_sql that are not of SQLAlchemy Engine type, Pandas only supports SQLite3 dialects, which can be observed in the error (the sqlite_master table is not valid for Snowflake DBs, it is only valid for SQLite3 DBs):

pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting

Follow this Snowflake documentation guide to install an SQLAlchemy engine for Snowflake DB, then rebuild the parts of the code that create the SQLAlchemy engine object. The Verifying Your Installation section in the guide has a code sample that makes use of the snowflake:// URI support:

engine = create_engine(
  'snowflake://{user}:{password}@{account}/'.format(
    user='<your_user_login_name>',
    password='<your_password>',
    account='<your_account_name>',
  )
)

Note 1: The SQLAlchemy support does not come with the standard Snowflake Python Connector installation and needs to be installed as an add-on for Pandas to make use of it.

Note 2: Support for NaN and NULL value inserts to databases is present in recent versions of Pandas, covered by another question: Python Pandas write to sql with NaN values

Harsh J
  • 666
  • 4
  • 7
  • Thank you, it also turned out that I was indeed using the import snowflake.connector instead of sqlalchemy, create_engine. I fixed that an this helped alot - the documentation here: https://docs.snowflake.net/manuals/user-guide/sqlalchemy.html also had examples for numpy, thanks again! – Pumpkincloud Jan 16 '20 at 21:13