I have 45 PySpark scripts to run, and the password is hard-coded in each of them. I want to store the password in a single file in HDFS and have all the scripts read it from there.

That way, when the password changes, I only have to update the file instead of every script (please refer to the script below).

from pyspark.context import SparkContext
from pyspark.sql import HiveContext
from pyspark.sql.functions import *
from pyspark.sql.types import *

sc = SparkContext()
sqlContext = HiveContext(sc)
sqlContext.setConf("spark.sql.tungsten.enabled", "false")

CSKU_query = """ (select * from CSKU) a """

CSKU = sqlContext.read.format("jdbc").options(
    url="jdbc:sap://myip:port",
    currentschema="SAPABAP1",
    user="username",
    password="mypassword",
    dbtable=CSKU_query,
).load()

CSKU.write.format("parquet").save("/user/admin/sqoop/base/sap/CSKU/")

Instead of specifying the password in each script, the script should fetch it from a file that I can reference.

Thanks in advance

Jacek Laskowski
Ankit
  • This isn't related to Spark at all. Your question boils down to “how do I read a variable from a configuration file”. There are many ways to do so and many of those are on StackOverflow. What are you having trouble with? – Oliver W. Nov 28 '19 at 11:11
  • Does this answer your question? [How to read a config file using python](https://stackoverflow.com/questions/19379120/how-to-read-a-config-file-using-python) – Oliver W. Nov 28 '19 at 11:12
  • I am not able to work out the syntax for reading the password from a file in PySpark. – Ankit Nov 28 '19 at 11:40
  • You don't need to use Spark to read a configuration file. Just use normal Python constructs, like `with open(configfile)`. – Oliver W. Nov 28 '19 at 12:07
  • I am getting the error below when using a `--password-file` argument in the script in Oozie. Script: `sc = SparkContext()` `sqlContext = HiveContext(sc)` `sqlContext.setConf("spark.sql.tungsten.enabled", "false")` `ZTGLINT011_query = """ (select * from ZTGLINT011) a """` `ZTGLINT011 = sqlContext.read.format("jdbc").options(url="jdbc:sap://172.28.1.121:30015",currentschema="SAPABAP1",user="SCRIBE",--password-file="hdfs://user/admin/newpwd/passwd/pwd.txt",dbtable=ZTGLINT011_query).load()` Stdoutput: `SyntaxError: keyword can't be an expression` – Ankit Nov 28 '19 at 13:34
  • That's a different issue, which requires a different question, Ankit. In the new question, add the code that produces that error message (comments are not the place to add long snippets of code - it’s very hard to read). – Oliver W. Nov 28 '19 at 19:20
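
The suggestion in the comments (read the file with plain Python, then pass the value as the ordinary `password` option) could look roughly like the sketch below. It is only an illustration: `read_password` and `jdbc_options` are hypothetical helper names, the password file is assumed to hold the password on its first line, and the URL, schema and user are the placeholders from the question.

```python
def read_password(path):
    """Return the password stored on the first line of a plain-text file.

    Assumes the file is readable from the driver's local filesystem.
    """
    with open(path) as handle:
        return handle.readline().strip()


def jdbc_options(password, dbtable):
    """Build the keyword arguments for the JDBC reader.

    url, currentschema and user are the placeholders from the question;
    only the password comes from outside the script.
    """
    return {
        "url": "jdbc:sap://myip:port",
        "currentschema": "SAPABAP1",
        "user": "username",
        "password": password,
        "dbtable": dbtable,
    }


# Hypothetical use inside one of the scripts (requires Spark, so it is
# left as a comment here):
#
#   opts = jdbc_options(read_password("/path/to/pwd.txt"), CSKU_query)
#   CSKU = sqlContext.read.format("jdbc").options(**opts).load()
```

If the file has to stay in HDFS, one option is to read it through Spark instead of `open()`, for example `sc.textFile("hdfs:///user/admin/newpwd/passwd/pwd.txt").first().strip()`; the path here is an assumption based on the one in the comment above. The `SyntaxError` in that comment arises because `--password-file` is not a valid Python keyword-argument name (it looks like a Sqoop command-line flag rather than a JDBC option), so the value read from the file has to be passed as `password=...`.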

0 Answers