
I'm working on a Spark application and have already written the logic, but I have little to no experience with creating a standalone application.

I need a runnable jar, but when I run scala path/to/my/jar I get

   java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$ 

This is my build.sbt

name := "Spark_Phase2"

version := "0.1"

organization := "bartosz.spark"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"

From what I have seen, something is wrong with the dependencies, but I could not figure out exactly what I have to do to make the jar runnable.

What puzzles me even more is that sbt run runs the code fine. So it would be nice if someone could write a step-by-step solution to this :)

One more thing: I have to accept a couple of command-line parameters with flags, and I have never done that before. Does anyone know of any good docs or tutorials on this?
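For the flags, one dependency-free option is to pattern match on the argument list. The following is only a minimal sketch: the `Config` fields and the flag names (`--input`, `--output`, `--partitions`, `--verbose`) are hypothetical placeholders, not something the assignment specifies.

```scala
// Hypothetical settings holder; field names and defaults are assumptions for illustration.
final case class Config(
  input: String = "",
  output: String = "out",
  partitions: Int = 4,
  verbose: Boolean = false
)

object ArgParser {
  // Consumes the argument list flag by flag.
  // Returns Left(message) on malformed input, Right(config) otherwise.
  @annotation.tailrec
  def parse(args: List[String], acc: Config = Config()): Either[String, Config] = args match {
    case Nil =>
      if (acc.input.isEmpty) Left("missing required --input") else Right(acc)
    case "--input" :: value :: rest =>
      parse(rest, acc.copy(input = value))
    case "--output" :: value :: rest =>
      parse(rest, acc.copy(output = value))
    case "--partitions" :: value :: rest =>
      scala.util.Try(value.toInt).toOption match {
        case Some(n) => parse(rest, acc.copy(partitions = n))
        case None    => Left(s"--partitions expects an integer, got '$value'")
      }
    case "--verbose" :: rest =>
      parse(rest, acc.copy(verbose = true))
    case unknown :: _ =>
      Left(s"unknown or incomplete option: $unknown")
  }
}

object Main {
  def main(args: Array[String]): Unit =
    ArgParser.parse(args.toList) match {
      case Right(cfg) =>
        println(s"running with $cfg")
      case Left(err) =>
        Console.err.println(err)
        sys.exit(1)
    }
}
```

Returning an `Either` keeps the parsing logic separate from the decision to exit, which makes it easy to test. For more than a handful of options, a dedicated library such as scopt may be a better fit.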

Bartors
  • You'll either need a fat jar, or you'll need to make sure that the Spark dependencies are available on the classpath. That being said, Spark doesn't support running applications outside `spark-submit` (it usually works, but it is not guaranteed to work). – zero323 Apr 23 '18 at 21:16
  • Any instructions on how to do it? – Bartors Apr 23 '18 at 21:17
  • https://github.com/sbt/sbt-assembly – zero323 Apr 23 '18 at 21:19
  • I have added the sbt-assembly plugin, but it did not help much: I get the same error when I try to run the jar file. I had to add `assemblyMergeStrategy in assembly := { case PathList("META-INF", xs @ _*) => MergeStrategy.discard case x => MergeStrategy.first }` to build.sbt to avoid a deduplication error (solution found here: https://stackoverflow.com/questions/25144484/sbt-assembly-deduplication-found-error ). Any idea what may be wrong? – Bartors Apr 23 '18 at 21:35
  • You need to build a fat jar first using the `sbt assembly` command and use that jar when running java, not the jar you created initially. – Denis Makarenko Apr 23 '18 at 22:10
  • I'm now trying to go with an `assemblyMergeStrategy`, yet I get a deduplicate error on org.apache.arrow when I run `sbt assembly`. I tried to use `case PathList("org", "apache", xs @ _*) => MergeStrategy.last`, but that did not help. – Bartors Apr 23 '18 at 22:16
  • As for spark-submit, unfortunately the lecturers want a standalone application that they can run from the command line with parameters, so I can't use spark-submit. – Bartors Apr 23 '18 at 22:29
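
The fat-jar route suggested in the comments could be set up roughly as follows. This is a sketch under stated assumptions, not a verified fix: the plugin version (sbt-assembly 0.14.6) and the main class name `bartosz.spark.Main` are placeholders that do not come from the question. Compared with the merge strategy quoted above, it keeps the `META-INF/services` entries rather than discarding them, since Spark relies on those files to register some of its components.

```scala
// --- project/plugins.sbt ---
// Assumed plugin version; use whichever release matches your sbt version.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")

// --- build.sbt ---
name := "Spark_Phase2"

version := "0.1"

organization := "bartosz.spark"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"

// Hypothetical entry point; replace with the object that defines main.
mainClass in assembly := Some("bartosz.spark.Main")

assemblyMergeStrategy in assembly := {
  // Keep service-loader registrations by concatenating them.
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.concat
  // Drop the remaining META-INF content (manifests, signatures).
  case PathList("META-INF", xs @ _*)             => MergeStrategy.discard
  // For any other duplicated file, take the first copy found.
  case _                                         => MergeStrategy.first
}
```

With this in place, `sbt assembly` should write the fat jar under target/scala-2.11/ (by default named along the lines of Spark_Phase2-assembly-0.1.jar), and that is the jar to start with `java -jar`. When launching without `spark-submit`, the application also has to set a master itself, for example with `.master("local[*]")` on the `SparkSession` builder.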
