33

I am configuring an Apache Spark cluster.

When I run the cluster with 1 master and 3 slaves, I see this on the master monitor page:

Memory
2.0 GB (512.0 MB Used)
2.0 GB (512.0 MB Used)
6.0 GB (512.0 MB Used)

I want to increase the memory used by the workers, but I could not find the right config for it. I have changed spark-env.sh as follows:

export SPARK_WORKER_MEMORY=6g
export SPARK_MEM=6g
export SPARK_DAEMON_MEMORY=6g
export SPARK_JAVA_OPTS="-Dspark.executor.memory=6g"
export JAVA_OPTS="-Xms6G -Xmx6G"

But the used memory is still the same. What should I do to change the used memory?

Nick Chammas
Minh Ha Pham
  • Sorry for the poorly worded question. I realize that what I want to change is the memory for the executors. Right now, the executor only uses `2.0 GB (512.0 MB Used)`. How do I increase that memory through a config setting or a system environment variable? – Minh Ha Pham Jun 17 '14 at 02:42

5 Answers

17

In Spark 1.0.0 and later, when using spark-shell or spark-submit, use the --executor-memory option. E.g.:

spark-shell --executor-memory 8G ...

In 0.9.0 and earlier:

Change the memory when you start a job or the shell. We had to modify the spark-shell script so that it would carry command-line arguments through to the underlying Java application. In particular:

OPTIONS="$@"
...
$FWDIR/bin/spark-class $OPTIONS org.apache.spark.repl.Main "$@"

Then we can run our spark shell as follows:

spark-shell -Dspark.executor.memory=6g

When configuring it for a standalone jar, I set the system property programmatically before creating the Spark context and pass the value in as a command-line argument (which I can make shorter than the long-winded system properties):

System.setProperty("spark.executor.memory", valueFromCommandLine)
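
For context, a minimal sketch of that pattern (the object name, master URL, app name, and argument handling are assumptions for illustration, not taken from the original code):

import org.apache.spark.SparkContext

object MyJob {  // hypothetical driver object
  def main(args: Array[String]): Unit = {
    // Take the executor memory from the command line, falling back to the 512m default.
    val executorMemory = if (args.nonEmpty) args(0) else "512m"

    // The property must be set before the SparkContext is created, or it has no effect.
    System.setProperty("spark.executor.memory", executorMemory)

    val sc = new SparkContext("spark://master:7077", "my-job")  // assumed master URL and app name
    // ... job logic ...
    sc.stop()
  }
}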

As for changing the default cluster-wide: sorry, I'm not entirely sure how to do it properly.

One final point: I'm a little worried by the fact that you have two nodes with 2 GB and one with 6 GB. The memory you can use will be limited to the smallest node, so here 2 GB.

samthebest
  • `spark.executor.memory` is the memory used by the application (job), not the memory allocated for the worker. – maasg Jun 16 '14 at 12:48
  • Ideally you would set the value in the spark-env.sh file. This allows you to set the default without having to pass in the argument every time you run the shell. – Climbs_lika_Spyder Mar 27 '15 at 11:47
12

In Spark 1.1.1, to set the max memory of executors, write this in conf/spark-env.sh:

export SPARK_EXECUTOR_MEMORY=2G

If you have not used the config file yet, copy the template file

cp conf/spark-env.sh.template conf/spark-env.sh

Then make the change and don't forget to source it

source conf/spark-env.sh
Climbs_lika_Spyder
Tristan Wu
10

In my case, I use an IPython notebook server to connect to Spark, and I want to increase the memory for the executor.

This is what I do:

from pyspark import SparkContext
from pyspark.conf import SparkConf

conf = SparkConf()
conf.setMaster(CLUSTER_URL).setAppName('ipython-notebook').set("spark.executor.memory", "2g")

sc = SparkContext(conf=conf)
Minh Ha Pham
9

According to the Spark documentation, you can change the memory per node with the command-line argument --executor-memory when submitting your application. E.g.:

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://master.node:7077 \
  --executor-memory 8G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

I've tested it, and it works.

FuzzY
1

The default configuration for the worker is to allocate Host_Memory - 1 GB to each worker. The configuration parameter to manually adjust that value is SPARK_WORKER_MEMORY, as in your question:

export SPARK_WORKER_MEMORY=6g
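
For clarity on the distinction discussed in the comments below: SPARK_WORKER_MEMORY caps the total memory a worker can offer (the first number on the master page), while the "(512.0 MB Used)" figure is the executor memory each application requests. A minimal sketch of requesting more executor memory via the SparkConf API (master URL, app name, and values are assumed for illustration):

import org.apache.spark.{SparkConf, SparkContext}

// The worker's total memory (the "2.0 GB" / "6.0 GB" figures) is fixed when the worker
// daemon starts, via SPARK_WORKER_MEMORY in spark-env.sh. The "(512.0 MB Used)" figure
// is what this application's executors request, set here:
val conf = new SparkConf()
  .setMaster("spark://master:7077")    // assumed master URL
  .setAppName("memory-demo")           // assumed app name
  .set("spark.executor.memory", "6g")  // request 6 GB per executor, within the worker's limit

val sc = new SparkContext(conf)
// ... job logic ...
sc.stop()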

maasg
  • I get your point, and that's kinda how the question is phrased, but I doubt that is what he wants, judging by what he describes on the master monitor page. – samthebest Jun 16 '14 at 15:55
  • @maasg In my case, each worker only uses **512 MB** by default. When I add the setting `export SPARK_WORKER_MEMORY=6g`, it does not increase the memory for the workers. I still see **512 MB** for each worker on the master monitor page. – Minh Ha Pham Jun 16 '14 at 17:28
  • The master page shows each worker's total memory and the memory currently used by a job. Your workers have 2/2/6 GB of total memory and are currently using 512 MB. That's the task executor's memory usage. To change that, use `spark.executor.memory`. See @samthebest's answer. – maasg Jun 16 '14 at 19:04
  • @maasg I want to increase the task executor's memory usage. I have added the line `export SPARK_JAVA_OPTS="-Dspark.executor.memory=6g"` to **spark-env.sh** but it is still **512 MB**. What is the right way to do that? – Minh Ha Pham Jun 17 '14 at 02:35
  • I have the same issue – crak Feb 19 '16 at 18:03