
I'm experimenting with a Spark Standalone cluster. I want to configure the cluster so that an application always uses as many of the cluster's resources as possible. I changed the following settings:

spark-defaults.conf (on Master node)
  spark.driver.cores  (works fine)
  spark.driver.memory (works fine)
  spark.executor.cores
  spark.executor.memory
spark-env.sh (on each node, depending on the available hardware)
  SPARK_WORKER_CORES
  SPARK_WORKER_MEMORY

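For reference, the corresponding entries look roughly like this (the values below are only placeholders to show the format, not my actual ones):

  # spark-defaults.conf (placeholder values)
  spark.driver.cores    2
  spark.driver.memory   8g
  spark.executor.cores  4
  spark.executor.memory 16g

  # spark-env.sh (placeholder values)
  SPARK_WORKER_CORES=12
  SPARK_WORKER_MEMORY=64g
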
Am I right that every executor always runs with the same resources, since spark.executor.cores and spark.executor.memory are set globally?

Is there no way to set different values per machine in order to use the resources more efficiently? My cluster has the following hardware:

 - Master: 12 CPU Cores & 128 GB RAM (~10 GB RAM / Core)
 - Slave1: 12 CPU Cores &  64 GB RAM (~ 5 GB RAM / Core)
 - Slave2:  6 CPU Cores &  64 GB RAM (~10 GB RAM / Core)

As you can see, the RAM-per-core ratio differs quite a bit between the machines. Is this the root of the problem?

– D. Müller

2 Answers


Just change spark-env.sh on the master and on each slave (executor node):

Master: spark-env.sh

SPARK_WORKER_CORES=12
SPARK_WORKER_MEMORY=10g

Slave 1: spark-env.sh

SPARK_WORKER_CORES=12
SPARK_WORKER_MEMORY=5g

Slave 2: spark-env.sh

SPARK_WORKER_CORES=6
SPARK_WORKER_MEMORY=10g

Have a look at the Spark configuration documentation for the details of these settings.
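
Note that spark-env.sh is only read when the worker daemons start, so the workers have to be restarted after editing it. A minimal sketch, assuming the standard sbin scripts and the default master port (7077); <master-host> is a placeholder:

  # On the master: restart the whole standalone cluster so the new
  # SPARK_WORKER_CORES / SPARK_WORKER_MEMORY values are picked up
  sbin/stop-all.sh
  sbin/start-all.sh

  # Or restart a single worker directly on a slave node
  sbin/stop-slave.sh
  sbin/start-slave.sh spark://<master-host>:7077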

– Ravindra babu
  • Thanks for your comment. I already tried this; the hardware is "reserved" for the cluster, but there is still only 1024m per node (= worker)! If I increase this value via spark.executor.memory, every executor gets the same amount of memory - but this doesn't utilize all of the hardware (because of the 64 vs. 128 GB RAM). – D. Müller Jan 08 '16 at 12:21
  • This should work. I have just found a related question: http://stackoverflow.com/questions/24242060/how-to-change-memory-per-node-for-apache-spark-worker. Have a look into it. – Ravindra babu Jan 08 '16 at 12:31
  • There's also the problem that each executor gets the same amount of memory, e.g. 60g. So my Slave1 & Slave2 would be utilized, but not my Master instance (only 60g of 120g used)! – D. Müller Jan 08 '16 at 12:33
  • Set spark.driver.memory for the master in the Spark default configuration. – Ravindra babu Jan 08 '16 at 13:20

OK, I got it working now by setting the following parameters:

spark-defaults.conf (on Master):
  spark.executor.memory 60g
  (spark.executor.cores is deliberately not set -> each executor can use all available cores; see SPARK_EXECUTOR_CORES below!)

Since I want to use the "strong" master node for both the driver AND executor processes, I also set the following:

spark-env.sh (on Master):
  SPARK_EXECUTOR_CORES=10 (use 10 of the 12 cores for executors; the remaining 2 are reserved for the driver process)
  SPARK_EXECUTOR_MEMORY=60g (only 60g, because spark.executor.memory already limits each executor to this size; setting more here would just waste resources!)

spark-defaults.conf (on Master):
  (spark.executor.memory set as written above (= 60g))
  spark.driver.cores 2 (use the 2 cores that are not used for executors)
  spark.driver.memory 10g (could be up to ~60g, since only 60 of the 128 GB are used for the executors)
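
As a quick sanity check, the application can be submitted without any resource flags, since the files above already carry the sizes; a minimal sketch (the master host, class and jar names are placeholders):

  # Executor/driver sizes come from spark-defaults.conf and spark-env.sh as set above
  ./bin/spark-submit \
    --master spark://<master-host>:7077 \
    --class my.app.Main my-app.jar

  # The master web UI (http://<master-host>:8080 by default) shows each worker's
  # cores/memory and the resources actually granted to the running application.
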
– D. Müller