
The purpose of the code below is to fetch data from a RESTful service, normalize it, keep the necessary columns in a DataFrame, and finally load it into a Sybase table using pandas' to_sql.

Error :

File "C:\Program Files\Anaconda3\lib\site-packages\sqlalchemy\engine\default.py", line 467, in do_executemany cursor.executemany(statement, parameters) sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] [Sybase][ODBC Driver][Adaptive Server Enterprise]Incorrect syntax near ','.\n (102) (SQLExecDirectW)") [SQL: 'INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") VALUES (?, ?, ?, ?)'] [parameters: (('0050/TAIEX', 'TAIEX', 'TWD', 0), ('035420/KORE', 'KORE', 'KRW', 0), ('0TL/LIF', 'LIF', 'NOK', 1), ('100FTSE/LIF', 'LIF', 'GBP', 0), ('101FTSE/LIF', 'LIF', 'GBP', 0), ('10STAT/OM', 'OM', 'SEK', 0), ('10TB/KFX', 'KFX', 'KRW', 0), ('10TBA/KFX', 'KFX', 'KRW', 0) ... displaying 10 of 4525 total bound parameter sets ... ('ZURF/DTB', 'DTB', 'CHF', 0), ('ZX/NYCE', 'NYCE', 'USD', 0))]

Process finished with exit code 1

Code :

from sqlalchemy import create_engine
from sqlalchemy.engine.url import make_url
from sqlalchemy.connectors.pyodbc import PyODBCConnector
import urllib.request as request
import json
import pandas as pd
from pandas.io.json import json_normalize

# Fetch and normalize the JSON payload
response = request.urlopen('http://tfsdscsw5XX/mdsclass/CONTFUTURES--O.json')
output = response.read()
data = json.loads(output)
df = json_normalize(data)

# Keep only the columns needed for the target table
df2 = df[['CONTRACT_ID', 'EXCHANGE_ID', 'CURRENCY', 'TRADING_CODE']]
print(df2)
print(df2.CONTRACT_ID)

connector = PyODBCConnector()
url = make_url("sybase+pyodbc://myhost/mydatabase?driver=Adaptive Server Enterprise&port=2306")
print(connector.create_connect_args(url))
engine = create_engine(url)

# It is failing here:
df2.to_sql("contract_test", engine, index=False, if_exists="append", schema="dbo")

response.close()

Sample of data in dataframe df2 :

      CONTRACT_ID EXCHANGE_ID CURRENCY  TRADING_CODE
0      0050/TAIEX       TAIEX      TWD             0
1     035420/KORE        KORE      KRW             0
2         0TL/LIF         LIF      NOK             1
3     100FTSE/LIF         LIF      GBP             0
4     101FTSE/LIF         LIF      GBP             0

Table contract_test definition :

CREATE TABLE contract_test (
    CONTRACT_ID char(12) NOT NULL,
    EXCHANGE_ID char(12),
    CURRENCY char(4) NOT NULL,
    TRADING_CODE smallint
) 
GO

Please help: how can this be resolved? I am stuck here.

Parfait
jmenghan

2 Answers


Your issue may simply be an incompatibility between Python database APIs. Pandas' to_sql ultimately runs an executemany() call through pyodbc. That module is most popularly used with SQL Server, especially in combination with SQLAlchemy, and its integration with Sybase is not fully supported. As noted on the SQLAlchemy Sybase docs page:

Note

The Sybase dialect within SQLAlchemy is not currently supported. It is not tested within continuous integration and is likely to have many issues and caveats not currently handled. Consider using the external dialect instead.

Specifically, executemany appears to be emitting a multi-row VALUES insert, which SQL Server supports but Sybase does not (even though both dialects are variants of T-SQL with a shared history):

INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0050/TAIEX', 'TAIEX', 'TWD', 0), 
       ('035420/KORE', 'KORE', 'KRW', 0), 
       ('0TL/LIF', 'LIF', 'NOK', 1), 
...

Instead, Sybase requires classic ANSI SQL with a separate INSERT INTO statement per row:

INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0050/TAIEX', 'TAIEX', 'TWD', 0) 
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('035420/KORE', 'KORE', 'KRW', 0)
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0TL/LIF', 'LIF', 'NOK', 1)
...

To resolve this, instead of pandas' convenient to_sql method, consider a direct SQLAlchemy executemany call, passing the data frame rows as a list via DataFrame.to_numpy(). The code below assumes the contract_test table already exists.

engine = create_engine(url)
sql = """INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
         VALUES (?, ?, ?, ?)"""

with engine.connect() as connection:
    result = connection.execute(sql, df2.to_numpy().tolist())

If the above still hits the same issue, integrate a for loop:

with engine.connect() as connection:
    for row in df2.to_numpy().tolist():
        result = connection.execute(sql, row)
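As a further fallback (suggested in the comments below), the same list-of-rows shape that DataFrame.to_numpy().tolist() produces can be fed straight to a plain pyodbc cursor.executemany, bypassing SQLAlchemy entirely. This is only a sketch: the driver name, server, and credentials in the commented connection call are placeholders, not values from the question.

```python
import pandas as pd

# Sample rows shaped like df2 (values taken from the question).
df2 = pd.DataFrame(
    [("0050/TAIEX", "TAIEX", "TWD", 0),
     ("035420/KORE", "KORE", "KRW", 0),
     ("0TL/LIF", "LIF", "NOK", 1)],
    columns=["CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE"],
)

# to_numpy().tolist() yields the list of row sequences that
# pyodbc's cursor.executemany expects as its parameter sets.
rows = df2.to_numpy().tolist()

# With a live connection (driver, server, and credentials are placeholders):
#   import pyodbc
#   conn = pyodbc.connect("DRIVER={Adaptive Server Enterprise};"
#                         "SERVER=myhost;PORT=2306;DATABASE=mydatabase;"
#                         "UID=user;PWD=secret")
#   cursor = conn.cursor()
#   cursor.executemany(
#       'INSERT INTO dbo.contract_test '
#       '("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") '
#       'VALUES (?, ?, ?, ?)', rows)
#   conn.commit()
#   conn.close()
```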
Parfait
  • I have used the code suggested above, but I get the following error: result = connection.executemany(sql, row) AttributeError: 'Connection' object has no attribute 'executemany' Process finished with exit code 1 – jmenghan Jul 02 '20 at 12:03
  • Try running with `execute`. See edit. The docs indicate the DB-API detects from the parameters (a list/tuple of iterables vs. a single iterable) when to use `executemany`. – Parfait Jul 02 '20 at 14:11
  • It now works with execute, but it is super slow, taking 12 minutes to insert 4500 rows into the Sybase table. How can I resolve this performance issue? – jmenghan Jul 02 '20 at 15:30
  • Which version of `execute` (with or without the for loop)? Try a non-SQLAlchemy version with a simple [pyodbc connection](https://github.com/mkleehammer/pyodbc/wiki/Cursor). Use `cursor.executemany`. – Parfait Jul 02 '20 at 15:37
  • It is still tremendously slow with pyodbc. I tried these combinations: 1) pyodbc: execute and executemany 2) sqlalchemy: execute and executemany. I have tried both 1 and 2 with and without the for loop. It is taking 9 minutes to insert 4500 rows. :( – jmenghan Jul 03 '20 at 08:25
  • Then the issue may be your environment: the client machine's CPU, the database configuration, or even the ODBC driver. Are you connected to the database by a hard line over the LAN? Or wirelessly on the same network, or from a remote network? Do you face other query performance issues beyond this append query? – Parfait Jul 03 '20 at 14:56
  • Technically, the performance issue may be a different question, as the original error should be resolved. – Parfait Jul 03 '20 at 14:57

The external SAP ASE (Sybase) dialect is now the recommended SQLAlchemy dialect for Sybase, and it does support fast_executemany if you use the SAP ASE ODBC driver.
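A minimal connection sketch, assuming the external sqlalchemy-sybase package and the SAP ASE ODBC driver are installed; the driver name, host, port, and credentials below are placeholders, not values from the question:

```python
import urllib.parse

# ODBC connection-string parameters for the SAP ASE ODBC driver
# (driver name, server, port, and credentials are placeholders).
odbc_str = (
    "DRIVER={Adaptive Server Enterprise};"
    "SERVER=myhost;PORT=2306;DATABASE=mydatabase;UID=user;PWD=secret"
)
url = "sybase+pyodbc:///?odbc_connect=" + urllib.parse.quote_plus(odbc_str)

# With sqlalchemy-sybase installed, the engine would then be created with
# fast_executemany enabled, and df2.to_sql used as in the question:
#   from sqlalchemy import create_engine
#   engine = create_engine(url, fast_executemany=True)
#   df2.to_sql("contract_test", engine, index=False,
#              if_exists="append", schema="dbo")
```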

Gord Thompson