0

I am trying to run the code below.

import pandas as pd
from sqlalchemy import create_engine
import urllib
#import pyodbc

params = urllib.parse.quote_plus("DRIVER='{ODBC Driver 17 for SQL Server}';SERVER=server.database.windows.net;DATABASE=my_db;UID=my_id;PWD=my_pwd")
myeng = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)

df.to_sql(name="dbo.my_table_name", con=myeng, if_exists='append', index=False)

I get an error when I hit the last line of code. I am getting this error.

DBAPIError: (pyodbc.Error) ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib ''{ODBC Driver 17 for SQL Server}'' : file not found (0) (SQLDriverConnect)")

I am reading through the documentation here.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html

Everything pretty much makes sense to me, but I'm not sure how to reference the SQL Server driver. When I look at the DOBC setup on my laptop, I see this.

enter image description here

I think this is ok, but I'm actually pushing data to an Azure Data Warehouse (on a server machine, not my local machine). I'm not sure how to check the driver on that DB sitting on the server. Also, I'm not totally sure, but the problem seems to come from either the DRIVER or the SERVER. Basically, I am just looking for some guidance as to how to make this work. Thanks!

ASH
  • 20,759
  • 19
  • 87
  • 200
  • Try removing the single quotes around the ODBC driver name. – Parfait Nov 25 '19 at 18:09
  • Also, discontinue using the `%` string format method. In Python 3, it has been [de-emphasised, (but not officially deprecated *yet*)](https://stackoverflow.com/a/13452357/1422451). – Parfait Nov 25 '19 at 18:11
  • I removed the single quotes; still getting the same error. – ASH Nov 25 '19 at 18:27
  • I removed the '%' characters and now I get an error on that same line of code: SyntaxError: invalid syntax – ASH Nov 25 '19 at 18:28
  • Hmmm...same exact error: `file not found`? Does a raw pyodbc (non sqlAlchemy connection) work: `pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};...')`? Also, what is your bit version: `import platform; print(platform.architecture()[0])`? – Parfait Nov 25 '19 at 18:36
  • Also, don't just remove `%`. You need to replace it with `str.format`: see [docs](https://docs.python.org/3/library/stdtypes.html#str.format) and answer in link. – Parfait Nov 25 '19 at 18:37
  • It is 64bit. I tried 'pyodbc.connect' and got the same result: Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect) It shouldn't be this hard. I feel like something is mis-configured or something got corrupted. I found several examples of this kind of thing online and it seems like it should totally work. – ASH Nov 25 '19 at 19:14
  • One thing definitely looks weird. If I use the '%' character, the error occurs on this line: 'params = pyodbc.connect'. If I use 'str.format', the error occurs on this line: 'myeng = sqlalchemy.create_engine'. – ASH Nov 25 '19 at 19:17

1 Answers1

1

To check installed drivers or DSNs on client machine, use the following lists from pyodbc:

# LIST OF INSTALLED DATA SOURCES (DSNs)
print(pyodbc.dataSources())

# LIST OF INSTALLED DRIVERS
print(pyodbc.drivers())

Do note: 32 or 64-bit versions will only appear on the analogous bit-version of your Python installation (i.e., only 32-bit drivers will show on Python 32-bit and similarly for 64-bit). You show a list of 64-bit drivers but may be running Python 32-bit of which none of those are available. Recall Windows maintains two odbc executables usually in below system folders:

  • C:\Windows\System32\odbcad32.exe (your screenshot)
  • C:\Windows\SysWOW64\odbcad32.exe
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • When I run those 2 commands I get this: {'spark': '/opt/simba/sparkodbc/lib/64/libsimbasparkodbc64.so'} [] – ASH Nov 25 '19 at 19:49
  • So, I'm guessing the ODBC driver is not installed. Is that right? Again, the source is a dataframe and the destination is an Azure Data Warehouse (SQL Server). – ASH Nov 25 '19 at 19:50
  • I just saw this: https://datathirst.net/blog/2018/10/12/executing-sql-server-stored-procedures-on-databricks-pyspark The guy is running some curl commands to get the environment setup. Then, towards the end of the page he runs some SQL. The code is totally different from what I am trying to do, but the concept is pretty similar. That seems like overkill thought. I would expect SQL Server, and all related drivers, to be setup to handle this kind of task. – ASH Nov 25 '19 at 20:04
  • What machine is this Python script running on? Are you running in a virtual env or VPN to another machine? Whatever env/machine this script runs must have that ODBC driver installed for either local or remote connection to SQL Server. Your screenshot must be from another machine as this list shows installed drivers on machine running Python. You can freely [download driver](https://www.microsoft.com/en-us/download/details.aspx?id=56567). – Parfait Nov 25 '19 at 20:52
  • It's running on Azure. I think that's the issue. I think I know the direction to go in now. I have to do more research. I'll follow up when I have a working solution. Thanks for trying to help!! – ASH Nov 26 '19 at 00:49