17

I am new to Apache Spark. I have tested some applications in Spark standalone mode, but now I want to run them in YARN mode. I am running Apache Spark 2.1.0 on Windows. Here is my command:

c:\spark>spark-submit2 --master yarn --deploy-mode client --executor-cores 4 --jars C:\DependencyJars\spark-streaming-eventhubs_2.11-2.0.3.jar,C:\DependencyJars\scalaj-http_2.11-2.3.0.jar,C:\DependencyJars\config-1.3.1.jar,C:\DependencyJars\commons-lang3-3.3.2.jar --conf spark.driver.userClasspathFirst=true --conf spark.executor.extraClassPath=C:\DependencyJars\commons-lang3-3.3.2.jar --conf spark.executor.userClasspathFirst=true --class "GeoLogConsumerRT" C:\sbtazure\target\scala-2.11\azuregeologproject_2.11-1.0.jar

EXCEPTION: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.

From searching the web, I created a folder named HADOOP_CONF_DIR, placed hive-site.xml in it, and pointed an environment variable at it. After that I ran spark-submit again and got a

connection refused exception. I think I have not configured the YARN setup properly. Could anyone help me solve this issue? Do I need to install Hadoop and YARN separately? I want to run my application in pseudo-distributed mode. Kindly help me configure YARN mode on Windows. Thanks.

Kalyan

2 Answers

23

You need to export two variables, HADOOP_CONF_DIR and YARN_CONF_DIR, to make your configuration files visible to YARN. If you are using Linux, add the lines below to your .bashrc file:

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

On Windows, you need to set these as environment variables instead.
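On Windows this can also be done from a Command Prompt with `setx`, which persists the variables for future sessions. This is a sketch: the path below is an assumption and must be replaced with whichever directory actually holds your configuration files (core-site.xml, yarn-site.xml):

```shell
:: Assumed path; point both variables at your real Hadoop config directory
setx HADOOP_CONF_DIR "C:\hadoop\etc\hadoop"
setx YARN_CONF_DIR "C:\hadoop\etc\hadoop"
```

Note that `setx` only affects new Command Prompt windows, so close and reopen the shell before running spark-submit again.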

Hope this helps!

koiralo
  • Thanks for the reply. I have seen the Linux config before, but currently I am working on Windows. I have a winutils folder, which is actually my Hadoop home, and it is set as my Hadoop home environment variable. Are you saying that I should set HADOOP_CONF_DIR and YARN_CONF_DIR as environment variables pointing at winutils's bin folder? – Kalyan Jun 08 '17 at 15:34
  • 1
    this one doesn't work for me. I solved this according to [here](https://stackoverflow.com/questions/45703235/when-running-with-master-yarn-either-hadoop-conf-dir-or-yarn-conf-dir-must-be?rq=1), which says modify the `spark_env.sh` file in spark conf folder – Litchy Apr 15 '19 at 03:26
  • What is the minimal change to make this work? Do you need that dir to exist? – mathtick Mar 12 '20 at 12:17
2

If you are running Spark on YARN, you need to add this to spark-env.sh:

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
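As a quick sanity check (a sketch, assuming a typical Hadoop layout under $HADOOP_HOME), you can confirm before re-running spark-submit that the directory exists and contains the YARN configuration files:

```shell
# Print the directory Spark will read its cluster config from
echo "$HADOOP_CONF_DIR"
# It should contain at least core-site.xml and yarn-site.xml
ls "$HADOOP_CONF_DIR" | grep -E 'core-site\.xml|yarn-site\.xml'
```

If the grep prints nothing, YARN mode will not work even with the variable set, because Spark cannot find the ResourceManager address.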
Rahul Bhat