10

I'm trying to connect from a Databricks notebook to an Azure SQL Datawarehouse using the pyodbc python library. When I execute the code I get this error:

Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)")

I understand that I need to install this driver but I have no idea how to do it. I have a Databricks cluster runing with Runetime 6.4, Standard_DS3_v2.

user2364105
  • 138
  • 1
  • 1
  • 6

2 Answers2

19

By default, Azure Databricks does not have ODBC Driver installed.

Run the following commands in a single cell to install MY SQL ODBC Driver on Azure Databricks cluster.

%sh
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17

enter image description here

CHEEKATLAPRADEEP
  • 12,191
  • 1
  • 19
  • 42
  • You mean "... MS SQL ODBC ..."? – Gord Thompson Apr 27 '20 at 11:13
  • Same script is failing while run it using InitScript `"invalid option "- E: Invalid operation update E: Unable to locate package msodbcsql17 E: Command line option ' ' [from -y ] is not understood in combination with the other options.` – Manish Jain Jun 25 '21 at 05:33
  • This works; to get the right version, you can run this first. ```%sh cat /etc/*release``` As you almost certainly are not using ubuntu 16.04 on any Databricks system. – John Stud Aug 21 '23 at 12:56
1

Instead of using the ODBC driver why don't you use the spark driver of Azure Synapse (aka SQL Data warehouse), databricks clusters have this driver installed by default ( com.databricks.spark.sqldw" ) .

Documentation : https://docs.databricks.com/data/data-sources/azure/synapse-analytics.html#language-python

Example of use :

df = spark.read \
.format("com.databricks.spark.sqldw") \
.option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>") \
.option("tempDir", "wasbs://<your-container-name>@<your-storage-account- 
name>.blob.core.windows.net/<your-directory-name>") \
.option("forwardSparkAzureStorageCredentials", "true") \
.option("dbTable", "my_table_in_dw") \
.load()
jegordon
  • 1,157
  • 3
  • 14
  • 17
  • If my answer is helpful for you, you can accept it as answer( click on the check mark beside the answer to toggle it from greyed out to filled in.). This can be beneficial to other community members. Thank you. – CHEEKATLAPRADEEP May 04 '20 at 12:21
  • Thank you very much for your advice, I could succesfully install odbc driver on db cluster but, as you said, is recommended to use spark driver for most of the operations with Azure Synapse. – user2364105 Aug 14 '20 at 16:02
  • 1
    The odbc driver is required to execute stored procedures – Jason Horner Feb 05 '21 at 23:30