- As documented (and seen in the code), any config item in the Spark config prefixed with spark.hadoop. is copied over to the Hadoop config without the prefix. So to set my.mapreduce.setting in the Hadoop conf, set spark.hadoop.my.mapreduce.setting in the Spark conf.
From the docs: "The better choice is to use spark hadoop properties in the form of spark.hadoop.*, and use spark hive properties in the form of spark.hive.*. For example, adding configuration “spark.hadoop.abc.def=xyz” represents adding hadoop property “abc.def=xyz”, and adding configuration “spark.hive.abc=xyz” represents adding hive property “hive.abc=xyz”. They can be considered as same as normal spark properties..."
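To make the copy concrete, a minimal PySpark sketch (my.mapreduce.setting is just the placeholder key from the note above; the _jsc attribute used to reach the JVM-side Hadoop Configuration is an internal PySpark handle):
# python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hadoop-prefix-demo")
    # the spark.hadoop. prefix is stripped; the rest lands in the Hadoop conf
    .config("spark.hadoop.my.mapreduce.setting", "xyz")
    .getOrCreate()
)

# read it back from the Hadoop Configuration to confirm the copy
hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
print(hadoop_conf.get("my.mapreduce.setting"))  # xyz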
- To set Spark config <name> to <value> (a runnable Python sketch follows these snippets):
# python
spark.conf.set("<name>", "<value>")
// scala
spark.conf.set("<name>", "<value>")
# R
library(SparkR)
sparkR.session()
sparkR.session(sparkConfig = list(<name> = "<value>"))
-- SQL
SET <name> = <value>;
Command line, with spark-submit, pyspark, or spark-shell:
--conf "<name>=<value>"
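For a concrete run of the Python form above, a minimal sketch; spark.sql.shuffle.partitions is used here only as an example of a runtime-settable <name>, it is not implied by the note:
# python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("conf-set-demo").getOrCreate()

# runtime-settable SQL properties can be changed on the live session
spark.conf.set("spark.sql.shuffle.partitions", "64")
print(spark.conf.get("spark.sql.shuffle.partitions"))  # 64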
- Precedence, as documented: properties set directly on the SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file.
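A quick way to see the first rule in action (sketch only; spark.app.test.key is a made-up key, and the script is assumed to be launched with spark-submit --conf spark.app.test.key=from-flag):
# python
from pyspark import SparkConf
from pyspark.sql import SparkSession

# a value set directly on SparkConf should win over the --conf flag
conf = SparkConf().set("spark.app.test.key", "from-code")
spark = SparkSession.builder.config(conf=conf).getOrCreate()

print(spark.conf.get("spark.app.test.key"))  # from-code, not from-flag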
Some links: