
I would like to secure an Oracle connection with PySpark using Oracle's Wallet.

For now, I am hardcoding the credentials and making a working connection with the following:

    # Connection properties
    driver = 'oracle.jdbc.driver.OracleDriver'
    connString = 'jdbc:oracle:thin:@1.1.1.1.sample/SID'
    user = 'user'
    pwd = 'pwd'
    jdbc_dict = {'driver': driver,
                 'connection': connString,
                 'user': user,
                 'pwd': pwd}

    df_table = spark.read.format('jdbc')\
                    .option('driver', jdbc_dict['driver'])\
                    .option('url', jdbc_dict['connection'])\
                    .option('dbtable', table_name)\
                    .option('user', jdbc_dict['user'])\
                    .option('password', jdbc_dict['pwd']).load()

Now I am moving to another environment and want to remove the credentials from the code.

Assuming that I have correctly created a Wallet on my client, how do I change the previously shown code to get it working with the Wallet?

I believe I have to point at the Wallet's path in some way, but I couldn't find any proper documentation regarding Spark, more precisely PySpark.

F.Peconi

1 Answer


Use YARN with the `--files` parameter in cluster mode to copy the wallet to your nodes, and then simply read it as a local file path.
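For example (the local wallet path and script name below are assumptions for illustration, not from the answer), shipping the wallet files with `spark-submit` could look like this:

```shell
# Distribute the wallet files to every executor's working directory.
# cwallet.sso, tnsnames.ora and sqlnet.ora are the usual wallet contents;
# /opt/wallet and my_job.py are hypothetical.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files /opt/wallet/cwallet.sso,/opt/wallet/tnsnames.ora,/opt/wallet/sqlnet.ora \
  my_job.py
```

With `--files`, each file lands in the YARN container's working directory on every node, so the job can refer to it by a relative path.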

Georg Heiler
  • How should I actually read it from Spark? Is there any particular option that I should set during `spark.read`? Perhaps you mean something like [this](https://blog.yannickjaquier.com/oracle/secure-external-password-store-implementation.html)? – F.Peconi Sep 03 '21 at 13:23
  • Basically it is just like https://stackoverflow.com/questions/57330285/how-to-access-external-property-file-in-spark-submit-job, and no, you do not need/want to use `spark.read`, just a local JVM read operation on, e.g., the driver – Georg Heiler Sep 03 '21 at 13:25
  • If I understand you correctly, with this method I will be able to reference the wallet within my nodes, which is nice. However, I must also tell my connection (made with `spark.read`) that I am now using the wallet for authentication and no credentials anymore. This is where my gap is – F.Peconi Sep 03 '21 at 13:38
  • Simply make any regular Java JDBC connection to Oracle and follow any regular example of how to use a wallet with Java; you should be set up to get it running, since you can now access the wallet file on the cluster nodes – Georg Heiler Sep 03 '21 at 18:31
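Putting the thread together, a minimal sketch of what the `spark.read` call could look like once the wallet files are available on the nodes. Everything here is an assumption for illustration: the wallet directory, the TNS alias `mydb_alias`, and the table name are hypothetical, and the `TNS_ADMIN` URL parameter requires a sufficiently recent Oracle thin driver (18.3+); older drivers instead take the wallet location via connection properties such as `oracle.net.wallet_location`:

```python
# Hedged sketch: assumes the wallet contents (cwallet.sso, tnsnames.ora,
# sqlnet.ora) were shipped to every node, e.g. via spark-submit --files,
# and sit under wallet_dir on the executors.
wallet_dir = "/path/to/wallet"  # hypothetical path

jdbc_options = {
    "driver": "oracle.jdbc.driver.OracleDriver",
    # No user/password options: the thin driver resolves the TNS alias
    # through tnsnames.ora and authenticates with the wallet found via
    # TNS_ADMIN. "mydb_alias" must be defined in tnsnames.ora.
    "url": f"jdbc:oracle:thin:@mydb_alias?TNS_ADMIN={wallet_dir}",
    "dbtable": "my_table",  # hypothetical table
}

# With a live SparkSession this would become:
# df_table = spark.read.format("jdbc").options(**jdbc_options).load()
```

The point is that the credentials disappear from the options entirely; the driver pulls them from the wallet that the cluster nodes can now read locally.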