
I'm using Spark 1.5.2 in standalone cluster mode.

Even though I'm setting SPARK_WORKER_MEMORY in spark-env.sh, it looks like this setting is ignored.

I can't find any indication in the scripts under bin/sbin that -Xms/-Xmx are being set.

If I run ps on the worker pid, it looks like the heap is set to 1G:

[hadoop@sl-env1-hadoop1 spark-1.5.2-bin-hadoop2.6]$ ps -ef | grep 20232
hadoop   20232     1  0 02:01 ?        00:00:22 /usr/java/latest//bin/java 
-cp /workspace/3rd-party/spark/spark-1.5.2-bin-hadoop2.6/sbin/../conf/:/workspace/
3rd-party/spark/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar:/workspace/
3rd-party/spark/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/workspace/
3rd-party/spark/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/workspace/
3rd-party/spark/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/workspace/
3rd-party/hadoop/2.6.3//etc/hadoop/ -Xms1g -Xmx1g org.apache.spark.deploy.worker.Worker 
--webui-port 8081 spark://10.52.39.92:7077

spark-defaults.conf:

spark.master            spark://10.52.39.92:7077
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.executor.memory   2g
spark.executor.cores    1

spark-env.sh:

export SPARK_MASTER_IP=10.52.39.92
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_MEMORY=12g

Am I missing something?

Thanks.

Seffy

4 Answers


When using spark-shell or spark-submit, use the --executor-memory option.
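
For example, against your standalone master (just a sketch; the master URL comes from your question and 2g matches the spark.executor.memory you already have in spark-defaults.conf):

./bin/spark-shell --master spark://10.52.39.92:7077 --executor-memory 2g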

When configuring it for a standalone jar, set the system property programmatically before creating the spark context.

System.setProperty("spark.executor.memory", executorMemory)

Radu Ionescu

You are using the wrong setting for cluster mode.

SPARK_EXECUTOR_MEMORY is the right option to set Executor memory in cluster mode.

SPARK_WORKER_MEMORY works only in standalone deploy mode.

Another way to set executor memory from the command line: -Dspark.executor.memory=2g
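
For example, something along these lines (just a sketch; SPARK_EXECUTOR_MEMORY would go into conf/spark-env.sh on the machine that launches the application, and --conf is the spark-submit equivalent of setting the system property):

# conf/spark-env.sh
export SPARK_EXECUTOR_MEMORY=2g

# or when submitting
./bin/spark-submit --conf spark.executor.memory=2g ...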

Have a look at one more related SE question regarding these settings:

Spark configuration, what is the difference of SPARK_DRIVER_MEMORY, SPARK_EXECUTOR_MEMORY, and SPARK_WORKER_MEMORY?

Ravindra babu
  • I'm doing my first steps with Spark, so hopefully I'm phrasing this correctly: I have a 5-node cluster, not managed by YARN/Mesos, so isn't that standalone cluster mode? I do set SPARK_WORKER_MEMORY to 12G, and it is indeed reported in the web UI as 12g, but from the command line it looks like the JVM is configured with only 1G, as you can see in the ps output in the question. – Seffy Jan 13 '16 at 10:18
  • You are running in cluster mode, and hence you are using this setting: spark://10.52.39.92:7077 – Ravindra babu Jan 13 '16 at 10:29
  • Not sure I understand. Given the original question updates, am I using the correct settings? Why does ps still report 1G? – Seffy Jan 13 '16 at 10:59
  • Given the original question updates, you have to use the SPARK_EXECUTOR_MEMORY setting instead of SPARK_WORKER_MEMORY. Just try this option and see the result. – Ravindra babu Jan 13 '16 at 11:01

This is my configuration in cluster mode, in spark-defaults.conf:

spark.driver.memory 5g
spark.executor.memory   6g
spark.executor.cores    4

Do you have something like this?

If you don't add this configuration (with your own values), the Spark executor will get 1 GB of RAM by default.

Otherwise you can pass these options to ./spark-submit like this:

# Run on a YARN cluster (deploy-mode can be client for client mode)
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000
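
Since your cluster is standalone rather than YARN, the same submit would look roughly like this (a sketch; the class name and jar path are placeholders, while the master URL and executor memory come from your question):

# Run on a standalone cluster
./bin/spark-submit \
  --class your.main.Class \
  --master spark://10.52.39.92:7077 \
  --executor-memory 2g \
  /path/to/your-app.jar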

When you run an application, check the master web UI at http://<ip or hostname of the master>:8080 to see whether the resources have been allocated correctly.

DanieleO
  • Thanks for answering. My cluster is not managed by YARN/Mesos. I've added the conf files to the question, does it make sense? – Seffy Jan 13 '16 at 10:26
  • Yes. The spark-submit example I posted was for YARN. Anyway, if you change the `master` it works. You can add `--executor-memory 20G` – DanieleO Jan 13 '16 at 10:30
  • @DanieleO What if the application is being run locally via `spark-submit`? So far, adjusting the `executor-memory` via the submission options has not been helping. – Brian Mar 12 '16 at 18:17

I've encountered the same problem as you. The reason is that, in standalone mode, spark.executor.memory is actually ignored. What has an effect is spark.driver.memory, because the executor lives inside the driver.

So what you can do is set spark.driver.memory as high as you need.
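
For example (a sketch; 12g here just mirrors the SPARK_WORKER_MEMORY value from your spark-env.sh):

# in conf/spark-defaults.conf
spark.driver.memory     12g

# or equivalently on the command line
./bin/spark-submit --driver-memory 12g ...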

This is where I've found the explanation: How to set Apache Spark Executor memory

Lewen