
I am trying to start spark-shell after setting up Spark 1.2.1 on the Cloudera QuickStart VM, but it fails with the error below. Any help in resolving this issue would be appreciated. The error log is:

16/03/03 09:40:37 INFO EventLoggingListener: Logging events to hdfs://quickstart.cloudera:8020/user/spark/applicationHistory/local-1457026830824
org.apache.spark.SparkException: spark.dynamicAllocation.{min/max}Executors must be set!
    at org.apache.spark.ExecutorAllocationManager.validateSettings(ExecutorAllocationManager.scala:135)
    at org.apache.spark.ExecutorAllocationManager.<init>(ExecutorAllocationManager.scala:98)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:377)
    at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
    at $iwC$$iwC.<init>(<console>:9)
    at $iwC.<init>(<console>:18)
    at <init>(<console>:20)
    at .<init>(<console>:24)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
    at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:123)
    at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:122)
    at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:270)
    at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:122)
    at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:60)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:147)
    at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:60)
    at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:106)
    at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:60)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:962)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:916)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1011)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


scala> 

1 Answer


The exception is pretty clear: it seems that you've set the spark.dynamicAllocation.enabled property to true, but failed to set spark.dynamicAllocation.minExecutors and spark.dynamicAllocation.maxExecutors. The Spark 1.2.1 documentation clearly states this (from the spark.dynamicAllocation.enabled description, emphasis mine):

This requires the following configurations to be set: spark.dynamicAllocation.minExecutors, spark.dynamicAllocation.maxExecutors, and spark.shuffle.service.enabled
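
For example, a minimal spark-defaults.conf sketch with those properties set (the executor counts here are placeholder values to illustrate the format, not recommendations for your cluster):

spark.dynamicAllocation.enabled        true
spark.shuffle.service.enabled          true
spark.dynamicAllocation.minExecutors   1
spark.dynamicAllocation.maxExecutors   4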

If you look at the 1.2 branch of Spark, you'll see that if you don't specify those values, they default to -1:

// Lower and upper bounds on the number of executors. These are required.
private val minNumExecutors = conf.getInt("spark.dynamicAllocation.minExecutors", -1)
private val maxNumExecutors = conf.getInt("spark.dynamicAllocation.maxExecutors", -1)

This behavior has since changed. If you look at the updated 1.6 branch of Spark, you'll see that they now default to 0 and Integer.MAX_VALUE, respectively:

// Lower and upper bounds on the number of executors.
private val minNumExecutors = conf.getInt("spark.dynamicAllocation.minExecutors", 0)
private val maxNumExecutors = conf.getInt("spark.dynamicAllocation.maxExecutors", 
                                           Integer.MAX_VALUE)

This simply means you need to add these properties either to your SparkConf settings or to whatever configuration file you provide to spark-shell:

import org.apache.spark.SparkConf

// SparkConf values are strings; pick executor bounds that suit your cluster
val sparkConf = new SparkConf()
  .set("spark.dynamicAllocation.minExecutors", "1")
  .set("spark.dynamicAllocation.maxExecutors", "4")
  • Thank you, Mr. Yuval. Your response helped resolve the issue. I did the following: in the spark-defaults.conf file, I set spark.dynamicAllocation.enabled=false and also ensured HDFS was started before invoking the Spark shell. The Spark context now starts successfully. However, I am not able to start spark-sql; the error says it is unable to connect to the Hive metastore. I started Hive and YARN and then started spark-sql, but got another error and still cannot start spark-sql. Is it not possible to run spark-shell and spark-sql on Spark 1.2.1 at the same time? – Naga Mar 05 '16 at 20:57