
After installing Spark 2.3 and setting the following environment variables in .bashrc (using Git Bash):

  1. HADOOP_HOME

  2. SPARK_HOME

  3. PYSPARK_PYTHON

  4. JDK_HOME
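
For reference, the .bashrc entries look like this (the paths below are hypothetical placeholders, not my exact install locations):

# Hypothetical example locations; adjust to the actual ones
export HADOOP_HOME='C:/hadoop'
export SPARK_HOME='C:/spark/spark-2.3.0-bin-hadoop2.7'
export PYSPARK_PYTHON='C:/Python36/python.exe'
export JDK_HOME='C:/Program Files/Java/jdk1.8.0'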

executing $SPARK_HOME/bin/spark-submit displays the following error:

Error: Could not find or load main class org.apache.spark.launcher.Main

I did some research on Stack Overflow and other sites, but could not figure out the problem.

Execution environment

  1. Windows 10 Enterprise
  2. Spark version - 2.3
  3. Python version - 3.6.4

Can you please provide some pointers?


3 Answers


I had that error message. It can have several root causes, but this is how I investigated and solved the problem (on Linux):

  • instead of launching spark-submit, try bash -x spark-submit to see which line fails.
  • repeat that process several times (since spark-submit calls nested scripts) until you find the underlying command being called; in my case it was something like:

/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp '/opt/spark-2.2.0-bin-hadoop2.7/conf/:/opt/spark-2.2.0-bin-hadoop2.7/jars/*' -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name 'Spark shell' spark-shell

So, spark-submit launches a java process but can't find the org.apache.spark.launcher.Main class using the files in /opt/spark-2.2.0-bin-hadoop2.7/jars/* (see the -cp option above). I ran ls in this jars folder and counted 4 files instead of the whole Spark distribution (~200 files). It was probably a problem during the installation process, so I reinstalled Spark, checked the jars folder, and it worked like a charm.

So, you should:

  • check the java command (the -cp option)
  • check your jars folder (does it contain at least all the spark-*.jar files?)
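
A quick sanity check along those lines (a minimal sketch; assumes SPARK_HOME points at your install):

# A healthy Spark 2.x distribution ships on the order of 200 jars,
# not a handful
ls "$SPARK_HOME"/jars | wc -l

# The core Spark jars, including the launcher, should all be present
ls "$SPARK_HOME"/jars/spark-*.jar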

Hope it helps.


Verify the steps below:

  1. Is spark-launcher_*.jar present in the $SPARK_HOME/jars folder?
  2. Extract spark-launcher_*.jar to verify whether it contains Main.class (see the commands below).
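
A minimal way to check both, assuming a JDK's jar tool is on the PATH:

# Step 1: is the launcher jar there at all?
ls "$SPARK_HOME"/jars/spark-launcher_*.jar

# Step 2: list the jar's contents and look for the class from the error
jar tf "$SPARK_HOME"/jars/spark-launcher_*.jar | grep launcher/Main.class
# a healthy jar prints: org/apache/spark/launcher/Main.class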

If both are true, then you may be running spark-submit on Windows using a Cygwin terminal.

Try using spark-submit.cmd instead. Also, Cygwin renders drives like /c/, which does not work on Windows, so it is important to provide absolute paths for the environment variables qualified with 'C:/' and not '/c/'.
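
For example, in .bashrc (the install directory below is a hypothetical placeholder):

# Wrong: Cygwin-style drive prefix, which Windows processes cannot resolve
# export SPARK_HOME='/c/spark/spark-2.3.0-bin-hadoop2.7'

# Right: Windows-style drive qualifier
export SPARK_HOME='C:/spark/spark-2.3.0-bin-hadoop2.7'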

  1. Check that your Spark home directory contains all folders and files (xml, jars, etc.); otherwise reinstall Spark.
  2. Check that your JAVA_HOME and SPARK_HOME environment variables are set in your .bashrc file; try setting them as below:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/

export SPARK_HOME=/home/ubuntu-username/spark-2.4.8-bin-hadoop2.6/

Or, wherever your Spark download is located:

export SPARK_HOME=/home/Downloads/spark-2.4.8-bin-hadoop2.6/

Once done, save your .bashrc, run the bash command in the terminal (or restart the shell), and try spark-shell.
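
A minimal sketch of that last step (assuming the exports above went into ~/.bashrc):

# Reload the updated environment without restarting the shell
source ~/.bashrc

# Confirm the variables resolve, then try Spark again
echo "$JAVA_HOME" "$SPARK_HOME"
"$SPARK_HOME"/bin/spark-shell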