
I run my Spark application on YARN with the following parameters:

in spark-defaults.conf:

spark.master yarn-client
spark.driver.cores 1
spark.driver.memory 1g
spark.executor.instances 6
spark.executor.memory 1g

in yarn-site.xml:

yarn.nodemanager.resource.memory-mb 10240
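
(In yarn-site.xml this is set as a standard Hadoop property element; the value shown is the one from my setup:)

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>10240</value>
</property>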

All other parameters are set to default.

I have a 6-node cluster and the Spark Client component is installed on each node. Every time I run the application, only 2 executors and 1 driver are visible in the Spark UI. The executors appear on different nodes.

Why can't Spark create more executors? Why are there only 2 instead of 6?

I found a very similar question: Apache Spark: setting executor instances does not change the executors, but increasing the memory-mb parameter didn't help in my case.

  • How do you run your application? Can you post your spark-submit command? Can you also be more specific about your nodes' memory? – eliasah Oct 26 '16 at 16:37
  • You have actually solved my problem with your question! I run spark-shell and then execute Scala commands inside it. I added "--num-executors 6" when running spark-shell and I got 6 executors. But why is that? Isn't it an optional counterpart to spark.executor.instances? I thought it was enough to set it in spark-defaults. – Anna Oct 26 '16 at 17:02
  • Cool. I'll close the question then. – eliasah Oct 26 '16 at 17:13
  • I figured it out. I have modified the wrong spark-defaults.conf file. I have two users and each user had a different SPARK_HOME directory (I didn't know it before). That's why I couldn't see any effect of my settings for one of the users. So simple, so time-consuming. Thank you for your help. – Anna Oct 26 '16 at 17:37
  • You're welcome! Unfortunately, sometimes simple stuff can take a lot of time when you only look at it from one set of eyes ;-) – eliasah Oct 26 '16 at 17:38
  • Please answer your own question (https://stackoverflow.com/help/self-answer, http://meta.stackexchange.com/questions/16930/is-it-ok-to-answer-your-own-question-and-accept-it) if you found the solution. –  Oct 26 '16 at 21:16
  • @eliasah: please don't add [solved] to titles, we don't do that here - we ask instead that OPs tick/accept an answer below. – halfer Oct 26 '16 at 22:48
  • @halfer thanks. Forgot about that ! At the same time I have voted for the question to be closed because there were no actual issue :) – eliasah Oct 27 '16 at 05:14

1 Answer


The configuration looks OK at first glance.

Make sure that you have edited the correct spark-defaults.conf file.

Execute echo $SPARK_HOME for the current user and verify that the modified spark-defaults.conf file is in the $SPARK_HOME/conf/ directory; otherwise Spark cannot see your changes.
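
For example, a quick check from the shell (the path below assumes the default Spark layout):

# which Spark installation does this user actually use?
echo $SPARK_HOME
# the edited file should live here:
ls -l $SPARK_HOME/conf/spark-defaults.conf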

In my case, I had modified the wrong spark-defaults.conf file. There were two users in my system and each user had a different $SPARK_HOME directory set (I didn't know that before). That's why I couldn't see any effect of my settings for one of the users.

You can also run your spark-shell or spark-submit with the argument --num-executors 6 (if you want to have 6 executors). If Spark then creates more executors than before, you can be sure that it's not a memory issue but rather a configuration file that isn't being read.
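
For example (the class and jar names below are only placeholders, not from the original setup):

# spark-shell with an explicit executor count
spark-shell --num-executors 6

# or for a packaged application (class and jar are placeholders)
spark-submit --num-executors 6 --class com.example.MyApp my-app.jar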

Anna