I'm experimenting with a Spark Standalone cluster. I would like to configure the cluster so that an application always uses as many of the cluster's resources as possible. I changed the following settings:
spark-defaults.conf (on the master node):
- spark.driver.cores (works fine)
- spark.driver.memory (works fine)
- spark.executor.cores
- spark.executor.memory
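Concretely, the relevant part of my spark-defaults.conf looks roughly like this (the numbers here are just placeholders for illustration, not my real values):

    spark.driver.cores     2
    spark.driver.memory    8g
    spark.executor.cores   2
    spark.executor.memory  8g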
spark-env.sh (on each node, depending on the available hardware):
- SPARK_WORKER_CORES
- SPARK_WORKER_MEMORY
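For example, on Slave1 (12 cores, 64 GB RAM) spark-env.sh contains something like this (the memory value is only illustrative, since I leave a few GB for the OS):

    SPARK_WORKER_CORES=12
    SPARK_WORKER_MEMORY=60g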
Am I right that each executor always runs with the same resources, since the settings spark.executor.cores and spark.executor.memory are set globally?
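To make sure I understand it correctly: with e.g. spark.executor.cores=2 and spark.executor.memory=8g (placeholder values), every executor on every worker would be a 2-core / 8 GB executor, and each worker would simply launch as many of them as fit into its own limits, roughly:

    # per worker (my understanding, rough):
    # executors = min(SPARK_WORKER_CORES  / spark.executor.cores,
    #                 SPARK_WORKER_MEMORY / spark.executor.memory)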
Is there no way to set different values per machine, in order to use the resources more efficiently? My cluster has the following hardware:
- Master: 12 CPU Cores & 128 GB RAM (~10 GB RAM / Core)
- Slave1: 12 CPU Cores & 64 GB RAM (~5 GB RAM / Core)
- Slave2: 6 CPU Cores & 64 GB RAM (~10 GB RAM / Core)
As you can see, the RAM-per-core ratio differs a lot between the machines. This seems to be the root of the problem, doesn't it?
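To illustrate with rough numbers why the global executor settings feel wasteful here (my own back-of-the-envelope calculation, assuming a worker also runs on the master node, as in my setup):

    # Option A: size executors for the 10 GB/core machines
    #   spark.executor.cores=1, spark.executor.memory=10g
    #   Master: 12 executors -> all 12 cores, ~120 of 128 GB used
    #   Slave1:  6 executors (memory-bound) -> only 6 of 12 cores used
    #   Slave2:  6 executors -> all  6 cores,  ~60 of  64 GB used
    #
    # Option B: size executors for the 5 GB/core machine
    #   spark.executor.cores=1, spark.executor.memory=5g
    #   Master: 12 executors -> all 12 cores,  ~60 of 128 GB used
    #   Slave1: 12 executors -> all 12 cores,  ~60 of  64 GB used
    #   Slave2:  6 executors -> all  6 cores,  ~30 of  64 GB used

Either way, one kind of resource sits idle on some of the machines, which is exactly what I'd like to avoid.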