I have recently been trying to connect to my SQL database on Microsoft Azure. I am trying a newer approach that can upload multiple columns of data from a pandas dataframe. It's a new approach, because it is not something that I have done before.
I get the following error:
Traceback (most recent call last):
File "src/pymssql/_pymssql.pyx", line 460, in pymssql._pymssql.Cursor.execute
File "src/pymssql/_mssql.pyx", line 1084, in pymssql._mssql.MSSQLConnection.execute_query
File "src/pymssql/_mssql.pyx", line 1115, in pymssql._mssql.MSSQLConnection.execute_query
File "src/pymssql/_mssql.pyx", line 1248, in pymssql._mssql.MSSQLConnection.format_and_run_query
File "src/pymssql/_mssql.pyx", line 1786, in pymssql._mssql.check_cancel_and_raise
File "src/pymssql/_mssql.pyx", line 1832, in pymssql._mssql.raise_MSSQLDatabaseException
pymssql._mssql.MSSQLDatabaseException: (206, b'Operand type clash: int is incompatible with textDB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "sql_import.py", line 109, in <module>
cursor.execute(sql_insert_clean[0], sql_tuple)
File "src/pymssql/_pymssql.pyx", line 478, in pymssql._pymssql.Cursor.execute
pymssql._pymssql.OperationalError: (206, b'Operand type clash: int is incompatible with textDB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n')
Here's the script I am working with:
import pymysql
#Set up SQL server
connection = pymssql.connect("xxx.database.windows.net", "xxx", "xxx", "coiner")
cursor = connection.cursor()
""" Some processing done here that I will skip"""
--- ... ---
#get data into tuples
sql_data = tuple(map(tuple, sql_dataframe.values))
#store into SQL database
for sql_tuple in sql_data:
print(len(sql_tuple))
cursor.execute(sql_insert_clean[0], sql_tuple)
connection.commit()
sql_insert_clean[0]
looks like the following:
INSERT INTO StocksRealTime (change,changePercent,dayVolume,exchange,id,marketHours,price,priceHint,quoteType,time,lastSize,currency,circulatingSupply,dayHigh,dayLow,fromcurrency,lastMarket,marketcap,shortName,volAllCurrencies,vol_24hr) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)
It uploads at least 420 rows of data and on the 421 row it throws the error. I have checked the row and I cannot see what may be wrong with it.
Additionally, this process is extremely slow! it took about 1 minutes to upload 400 rows. My actual dataset is ~1m rows of data, so how would I improve the speed?
Example of my dataset around rows 439:442
:
439 440 441 442 types
change 5.279999 116.0298 7 39.59998 object
changePercent 2.06775 5.425476 3.660322 5.439557 object
dayVolume 54856566.0 7831598.0 40869414.0 56702940.0 float64
exchange NMS NMS NMS NMS object
id MSFT AMZN FB TSLA object
marketHours 1 1 1 1 int64
price 260.63 2254.64 198.24 767.6 object
priceHint 4 4 4 4 int64
quoteType 8 8 8 8 int64
time 3304942412000 3304942412000 3304942414000 3304942404000 int64
lastSize 200.0 0.0 0.0 0.0 float64
currency object
circulatingSupply 0.0 0.0 0.0 0.0 float64
dayHigh object
dayLow object
fromcurrency object
lastMarket object
marketcap 0.0 0.0 0.0 0.0 float64
shortName object
volAllCurrencies 0.0 0.0 0.0 0.0 float64
vol_24hr 0.0 0.0 0.0 0.0 float64