I have a REST API in Scala Spray that triggers Spark jobs like the following:
path("vectorize") {
get {
parameter('apiKey.as[String]) { (apiKey) =>
if (apiKey == API_KEY) {
MoviesVectorizer.calculate() // Spark Job run in a Thread (returns Future)
complete("Ok")
} else {
complete("Wrong API KEY")
}
}
}
}
I'm trying to find the way to specify Spark driver memory for the jobs. As I found, configuring driver.memory from within the application code doesn't effect anything.
The whole web application along with the Spark is packaged in a fat Jar. I run it by running
java -jar app.jar
Thus, as I understand, spark-submit is not relevant here (or is it?). So, I can not specify --driver-memory option when running the app.
Is there any way to set the driver memory for Spark within the web app?
Here's my current Spark configuration:
val spark: SparkSession = SparkSession.builder()
.appName("Recommender")
.master("local[*]")
.config("spark.mongodb.input.uri", uri)
.config("spark.mongodb.output.uri", uri)
.config("spark.mongodb.keep_alive_ms", "100000")
.getOrCreate()
spark.conf.set("spark.executor.memory", "10g")
val sc = spark.sparkContext
sc.setCheckpointDir("/tmp/checkpoint/")
val sqlContext = spark.sqlContext
As it is said in the documentation, Spark UI Environment tab shows only variables that are effected by the configuration. Everything I set is there - apart from spark.executor.memory.