We want to set the AWS parameters that, in code, would be set via the SparkContext:

sc.hadoopConfiguration.set("fs.s3a.access.key", vault.user)
sc.hadoopConfiguration.set("fs.s3a.secret.key", vault.key)

However, we have a custom Spark launcher framework that requires all custom Spark configuration to be passed as --conf parameters on the spark-submit command line.

Is there a way to tell the SparkContext to apply --conf values to its hadoopConfiguration rather than to its general SparkConf? We are looking for something along the lines of

spark-submit --conf hadoop.fs.s3a.access.key $vault.user --conf hadoop.fs.s3a.secret.key $vault.key

or

spark-submit --conf hadoopConfiguration.fs.s3a.access.key $vault.user --conf hadoopConfiguration.fs.s3a.secret.key $vault.key
WestCoastProjects

1 Answer

You need to prefix Hadoop configs with spark.hadoop. on the command line (or in the SparkConf object). Spark strips the prefix and copies the remaining key into the Hadoop configuration. For example:

spark.hadoop.fs.s3a.access.key=value
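
Applied to the two keys from the question, the spark-submit call could look roughly like the sketch below. The credential values and `your-app.jar` are placeholders I made up for illustration; the command is only printed, not run, so you can check the arguments before handing them to your launcher.

```shell
#!/bin/sh
# Hypothetical placeholder values; in practice these would come from the vault.
VAULT_USER="example-access-key"
VAULT_KEY="example-secret-key"

# Each Hadoop property is passed as --conf key=value with the spark.hadoop. prefix.
# your-app.jar is an assumed application jar name.
CMD="spark-submit --conf spark.hadoop.fs.s3a.access.key=${VAULT_USER} --conf spark.hadoop.fs.s3a.secret.key=${VAULT_KEY} your-app.jar"

# Print the assembled command instead of executing it.
echo "$CMD"
```

Note that --conf expects a single key=value argument, not the key and value separated by a space as in the question.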
zero323
vanza