0

I'm working within a Synapse workspace.

I have a linked service to a oracle database in a private network (10.x.x.x)

Linked service

I have a SparkJob which is trying to connect to that database with the oracle thin client and the default method in the spark library, i.e. spark.read.jdbc, and it can not reach it.

Error message

Is it possible for the spark job to reach the database?

I've tried to open firewall connections but can not find from where to where because the Spark pool is in a virtual network for which I don't know the network segment.

Aswin
  • 4,090
  • 2
  • 4
  • 16

1 Answers1

0

In order to read data from oracle DB using Pyspark inside synapse notebook The Below steps will help you:

Step 1: Add the jar ojdbc8-19.15.0.0.1.jar to the Synapse workspace packages.

Step 2: Add that workspace package to the Apache spark pool

Step 3:execute the following code:

host = 'my_host_url'  
port = 1521  
service_name = 'my_service_name'  
jdbcUrl = f'jdbc:oracle:thin:@{host}:{port}:{service_name}'  
    sql = 'SELECT * FROM my_table'  
    user = 'my_username'  
    password = 'my_password'  
    jdbcDriver = 'oracle.jdbc.driver.OracleDriver'  
    jdbcDF = spark.read.format('jdbc') \  
        .option('url', jdbcUrl) \  
        .option('query', sql) \  
        .option('user', user) \  
        .option('password', password) \  
        .option('driver', jdbcDriver) \  
        .load()  
    display(jdbcDF)

By following the above steps you can read data from oracle db in spark notebook.

  • Hi @Dileeprajnarayan, I tried your solution but got the same error: `Py4JJavaError: An error occurred while calling o3902.load. : java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection Caused by: java.net.ConnectException: Connection timed out` Do you think that I have be aware of any network rules? – Ricardo Olvera Jul 27 '23 at 00:52
  • @RicardoOlvera try allowing the network and IP address try the above process. Also in the beginning of the code. use from pyspark.sql import SparkSession and this spark = SparkSession.builder \ .appName("MyApp") \ .config("spark.driver.extraClassPath", "/path/to/oracle-jdbc-driver.jar") \ .getOrCreate() – DileeprajnarayanThumula Jul 27 '23 at 03:37
  • @RicardoOlvera check this https://www.youtube.com/watch?v=sQe6VYjS7UY&t=401s&ab_channel=WafaStudies – DileeprajnarayanThumula Jul 27 '23 at 04:08