Hi, I'm working with SparkR in YARN mode.
When I submit an application this way:
./spark-submit --master yarn-client --packages com.databricks:spark-csv_2.10:1.0.3 \
  --driver-memory 6g --num-executors 8 --executor-memory 6g \
  --total-executor-cores 32 --executor-cores 8 \
  /home/sentiment/Scrivania/test3.R
One node starts as the AM (I think it is chosen randomly) and takes 1 GB of memory and 1 vcore. After that, every node shows 7 GB of memory and 1 vcore in use (except the node running the AM, which shows 8 GB and 2 vcores).
Why don't the nodes acquire 4 cores, as my configuration and the spark-submit options say?
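In case it helps, the per-node usage above can be checked from the ResourceManager web UI (server1:8088, Nodes page) or with the standard YARN CLI; a quick sketch (the node id below is only a placeholder, taken from the node list):

yarn node -list -all
yarn node -status server2:45454   # shows memory and vcore usage/capacity for that node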
spark-defaults.conf
spark.master spark://server1:7077
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.driver.memory 5g
spark.executor.memory 6g
spark.executor.cores 4
spark.akka.frameSize 1000
spark.yarn.am.cores 4
spark.kryoserializer.buffer.max 700m
spark.kryoserializer.buffer 100m
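For completeness, the same per-executor core setting can also be passed directly on the command line instead of relying on spark-defaults; a sketch, assuming 4 cores per executor is what I'm after:

./spark-submit --master yarn-client --packages com.databricks:spark-csv_2.10:1.0.3 \
  --driver-memory 6g --num-executors 8 --executor-memory 6g --executor-cores 4 \
  /home/sentiment/Scrivania/test3.R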
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>server1:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>server1:8035</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>server1:8050</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>server1:8088</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>4</value>
  </property>
</configuration>
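For context, the cluster-wide memory and vcore totals the ResourceManager is actually working with can be read back from its REST API (a sketch, assuming curl is available; the webapp address is the one configured above):

curl -s http://server1:8088/ws/v1/cluster/metrics
# the JSON response includes fields such as totalVirtualCores, allocatedVirtualCores and availableVirtualCores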
Update 1:
I read in an old post that I needed to change the value of the property below from the DefaultResourceCalculator to the DominantResourceCalculator in capacity-scheduler.xml:
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
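For reference, after editing capacity-scheduler.xml the scheduler configuration needs to be reloaded (or the ResourceManager restarted) before the change takes effect; a sketch, assuming the standard Hadoop CLI and sbin layout:

yarn rmadmin -refreshQueues                            # reloads capacity-scheduler.xml
$HADOOP_HOME/sbin/yarn-daemon.sh stop resourcemanager  # or restart the ResourceManager
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager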
I also added this to spark-env.sh:
SPARK_EXECUTOR_CORES=4
Nothing changed.
Update 2: I read the following in the official Spark documentation. So is 1 core per executor the maximum in YARN mode?
spark.executor.cores The number of cores to use on each executor. For YARN and standalone mode only. In standalone mode, setting this parameter allows an application to run multiple executors on the same worker, provided that there are enough cores on that worker. Otherwise, only one executor per application will run on each worker.