I am trying to load a SQL Server table from a CSV file that contains blank values in some of its columns. I am using Python 3.7 with pyodbc and pandas for this.
Below is my DF -
ID  Name  City    Salary
1   ABC   Pune    100.0
2   DEF   NaN     200.0
3   GHI   Mumbai  300.0
4   JKL   NaN     NaN
5   MNO   Delhi   NaN
and below is the code that I am using -
import pandas as pd
import pyodbc
conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};Server=test;Port=test;database=test;UID=test;PWD=test')
cursor = conn.cursor()
df = pd.read_csv(r"E:\test\PyODBC NaN Handling\test.csv")
for row in df.itertuples():
    cursor.execute('''
        INSERT INTO test.dbo.test ([ID],[Name],[City],[Salary])
        VALUES (?,?,?,?)
        ''',
        row.ID, row.Name, row.City, row.Salary)
conn.commit()
This gives me the below error -
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]
The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 6 (""):
The supplied value is not a valid instance of data type float.
Check the source data for invalid values.
An example of an invalid value is data of numeric type with scale greater than precision. (8023) (SQLExecDirectW)')
After some research, my understanding is that pyodbc cannot handle the NaN values that pandas produces for blank cells, and that we need to use df.fillna() to replace them with a constant value (zero or a space).
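For reference, the fillna() workaround I found looks roughly like the sketch below. The DataFrame is rebuilt inline from the sample above instead of being read from my CSV, and the fill values (an empty string and 0.0) are only placeholders chosen for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from above, built inline (assumption: this mirrors what read_csv returns)
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Name": ["ABC", "DEF", "GHI", "JKL", "MNO"],
    "City": ["Pune", np.nan, "Mumbai", np.nan, "Delhi"],
    "Salary": [100.0, 200.0, 300.0, np.nan, np.nan],
})

# Replace blanks with a constant per column (placeholder values, not real data)
filled = df.fillna({"City": "", "Salary": 0.0})

print(filled)
```

With this, the table ends up holding '' and 0 where I actually want NULL, which is what I am trying to avoid.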
Can someone please suggest an alternative solution that inserts NULL values into the table without explicitly converting the blanks to a constant value?
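One idea I have come across, but not yet confirmed against my table, is to turn every NaN into Python's None before binding the parameters, since pyodbc sends None as NULL. A minimal sketch of that conversion follows. The DataFrame is again rebuilt inline from the sample, the variable names are my own, and the database calls are left as comments because they need a live connection.

```python
import numpy as np
import pandas as pd

# Sample data from above, built inline (assumption: this mirrors what read_csv returns)
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Name": ["ABC", "DEF", "GHI", "JKL", "MNO"],
    "City": ["Pune", np.nan, "Mumbai", np.nan, "Delhi"],
    "Salary": [100.0, 200.0, 300.0, np.nan, np.nan],
})

# Build plain tuples, swapping each NaN for None so it can be bound as NULL
rows = [
    tuple(None if pd.isna(value) else value for value in row)
    for row in df.itertuples(index=False, name=None)
]

print(rows[3])

# With an open pyodbc cursor, the rows could then be inserted like this (untested here):
# sql = "INSERT INTO test.dbo.test ([ID],[Name],[City],[Salary]) VALUES (?,?,?,?)"
# cursor.executemany(sql, rows)
# conn.commit()
```

If this is a reasonable direction, I would like to know whether it is the recommended way or whether there is something cleaner.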
Thanks,
Sanket Kelkar