0

I have made a self hosted integration runtime and created a linked service to sql server but having problem in fetching data in synapse notebook using pyspark code.

I want pyspark code for Fetching data from On-Premise Sql Server to Azure Synapse Notebook using Linked Service and IR

I have updated the code

enter image description here Actually my server name is AKHAYASQL/SQL2019 and its in my local machine i am trying to connect it using self hosted IR

error message: enter image description here

  • Could you please provide your approach and error if any got? – Bhavani Aug 22 '23 at 04:07
  • so first i have created self hosted IR and used that in copy activity. Its working fine .but when i am trying to access using synapse notebook using jdbc having problem .Is there any method to access directly on-premise SQL server to synapse notebook through self hosted IR and liked service – Akhaya Chandan Mishra Aug 22 '23 at 04:50
  • Could you please provide the error which you are getting? – Bhavani Aug 22 '23 at 04:56
  • Py4JJavaError: An error occurred while calling o3914.load. : java.sql.SQLException: No suitable driver at java.sql.DriverManager.getDriver(DriverManager.java:315) – Akhaya Chandan Mishra Aug 22 '23 at 05:00
  • Could you please provide your sample code? – Bhavani Aug 22 '23 at 05:00
  • code-from pyspark.sql import SparkSession # Create a Spark session and configure the external JAR spark = SparkSession.builder \ .appName("SQLServer Connection") \ .config("spark.jars", "synfs:/935/abc/mssql-jdbc-7.2.1.jre8.jar") \ .getOrCreate() # Configure linked service credentials linked_service_name = "SqlServer1" df = spark.read \ .format("jdbc") \ .option("url", f"jdbc:sqlserver://;{linked_service_name}") \ .option("dbtable", "[dbo].[employee]") \ .load() # Show the data df.show() # Stop the Spark session spark.stop() – Akhaya Chandan Mishra Aug 22 '23 at 05:00

1 Answers1

0

When I tried to replicate the issue in my environment, with below code:

linked_service_name = "SqlServer1"
df = spark.read \
.format("jdbc") \
.option("url", f"jdbc:sqlserver://;{linked_service_name}") \ 
.option("dbtable", "dbo.student") \ 
.load()
df.show()

I got the same error:

enter image description here

I added driver to the code as mentioned below:

jdbcDriver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"       
linked_service_name = "SqlServer1"
df = spark.read \
    .format("jdbc") \
    .option("driver", jdbcDriver)\
    .option("url", f"jdbc:sqlserver://;{linked_service_name}") \
    .option("dbtable", "dbo.student") \
    .load()
df.show()

I got below error:

enter image description here

As per this Jdbc URl should be in below format:

jdbc:sqlserver://[serverName[\instanceName][:portNumber]][;property=value[;property=value]]

So, I connected to SQL server with above jdbc URL format with below code:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("OnPremSQLFetch") \
    .getOrCreate()

query = "SELECT * FROM student"
data = spark.read \
    .format("jdbc") \
    .option("url", f"jdbc:sqlserver://<serverName>:1433;databaseName=<dbName>") \
    .option("dbtable", f"({query})") \
    .option("user", "<userName>") \
    .option("password", "<password>") \
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
    .load()

data.show()

It connected to on-premises sql server successfully.

enter image description here

If the on-premises data source is not publicly accessible, then copy the data from on-premises to Azure SQL database using this and retrieve data from Azure SQL database with mentioned code.

As per this it is not possible to connect on-premises sql server directly in synapse notebook. You can follow above procedures.

Bhavani
  • 1,725
  • 1
  • 3
  • 6