How to safely pass credentials to the JDBC interface in PySpark

Asked Dec 10 '16 at 01:12

Active Dec 10 '16 at 05:52

Viewed 57 times

Questions like this one seem to indicate that a database can be queried directly from PySpark. I would like to update a data pipeline that currently uses Sqoop to use this approach instead. With Sqoop you can pass -P, which prompts for the password on the console so your credentials stay hidden. I don't see how to do the equivalent with the JDBC interface: all the examples I can find suggest hardcoding the username and password into the script. The data in my environment is sensitive, so I cannot do this.

df = sqlCtx.load(source="jdbc",
                 url="jdbc:oracle:thin://x.x.x.x/xdb?user=****&password=****",
                 dbtable="somequery")

I have read that even libraries such as getpass, which hide the input from the terminal, are sometimes vulnerable to memory attacks. Is there a safe way to do this?
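For illustration, here is a minimal sketch of one way to keep the credentials out of both the script and the JDBC URL: read them from the environment (or prompt for the password) and hand them to Spark as connection properties. The helper name jdbc_properties and the variable names DB_USER and DB_PASSWORD are hypothetical, chosen only for this example. This only avoids hardcoding; it does not address the in-memory concern, since the password still lives in the driver process.

```python
import getpass
import os


def jdbc_properties(env=None, prompt=getpass.getpass):
    """Build JDBC connection properties without hardcoding credentials.

    DB_USER and DB_PASSWORD are hypothetical variable names used only in
    this sketch. If DB_PASSWORD is unset, the password is requested
    interactively and is not echoed to the terminal.
    """
    env = os.environ if env is None else env
    user = env.get("DB_USER")
    if not user:
        raise KeyError("DB_USER is not set")
    password = env.get("DB_PASSWORD") or prompt("Database password: ")
    return {"user": user, "password": password}


# Usage, assuming an existing SQLContext named sqlCtx (Spark 1.4+), the
# JDBC driver on the classpath, and jdbc_url holding the connection URL
# with the user/password parameters removed:
#
# props = jdbc_properties()
# df = sqlCtx.read.jdbc(url=jdbc_url, table="somequery", properties=props)
```

Passing the credentials through the properties argument of DataFrameReader.jdbc keeps them out of the URL string, so they are less likely to end up in source control or in logged connection strings.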

edited May 23 '17 at 10:29

Community

asked Dec 10 '16 at 01:12

FaceInvader

0 Answers