
I am following the Scala tutorial at https://spark.apache.org/docs/2.1.0/quick-start.html

My Scala file:

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/data/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println(s"Lines with a: $numAs, Lines with b: $numBs")
    sc.stop()
  }
}

and my build.sbt:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.12.4"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "2.2.0" 

I ran sbt package successfully (I had already deleted everything except the Scala source code and build.sbt, then ran sbt package again):

[info] Loading project definition from /home/cpu11453local/workspace/testspark_scala/project
[info] Loading settings from build.sbt ...
[info] Set current project to Simple Project (in build file:/home/my_name/workspace/testspark_scala/)
[info] Packaging /home/my_name/workspace/testspark_scala/target/scala-2.12/simple-project_2.12-1.0.jar ...
[info] Done packaging.
[success] Total time: 1 s, completed Nov 8, 2017 12:15:24 PM

However, when I run spark-submit:

$SPARK_HOME/bin/spark-submit --class "SimpleApp" --master local[4] simple-project_2.12-1.0.jar 

I got this error:

java.lang.NoClassDefFoundError: scala/runtime/LambdaDeserialize

The full spark-submit output is in a gist.

  • You are using incompatible Scala and Spark versions; see e.g. https://stackoverflow.com/questions/43883325/scala-spark-version-compatibility – Alexey Romanov Nov 08 '17 at 06:42
  • Change `scalaVersion := "2.12.4"` to `scalaVersion := "2.11.8"` and `"spark-core_2.10"` to `"spark-core_2.11"`. – Jacek Laskowski Nov 08 '17 at 06:46
  • BTW, if you've just started working with Spark, use http://spark.apache.org/docs/latest/sql-programming-guide.html instead. Please. – Jacek Laskowski Nov 08 '17 at 06:47

5 Answers


As @Alexey said, changing the Scala version to 2.11 fixed the problem.

build.sbt

name := "Simple Project"

version := "1.0"

scalaVersion := "2.11.11"

libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.2.0" 

Note that the Scala version MUST match the Scala version your Spark distribution was built with. Look at the artifactId: spark-core_2.11 means it is compatible with Scala 2.11 only (Scala binary versions are neither backward nor forward compatible). That is exactly where the NoClassDefFoundError comes from: a jar compiled with Scala 2.12 deserializes its lambdas through scala/runtime/LambdaDeserialize, a class that does not exist in the Scala 2.11 runtime shipped with Spark.
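
To avoid spelling the suffix by hand, sbt's %% operator appends the Scala binary suffix that matches scalaVersion automatically. A minimal sketch of the same build.sbt using it:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.11.11"

// %% appends the Scala binary suffix (_2.11 here), so the
// spark-core artifact always matches scalaVersion
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"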

Haha TTpro

Following are the build.sbt entries for the Spark 2.4.1 release, matching the sample shown in the Spark/Scala online guide:

name := "SimpleApp" 
version := "1.0"
scalaVersion := "2.12.8"
libraryDependencies += "org.apache.spark"  %% "spark-sql" % "2.4.1"

Though everything works fine inside the IntelliJ IDE, the application still fails with the following exception,

Caused by: java.lang.NoClassDefFoundError: scala/runtime/LambdaDeserialize

after creating the package with the sbt package command and running spark-submit from the command line as follows:

spark-submit -v --class SimpleApp --master local[*] target\scala-2.12\simpleapp_2.12-1.0.jar
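
A failure like this usually means the Spark distribution behind spark-submit was built against a different Scala binary version than the jar. A quick way to check, assuming a Unix-style shell, is to open spark-shell and print the runtime Scala version (the output line shown is what a Scala 2.11.12 build would print):

$SPARK_HOME/bin/spark-shell

scala> scala.util.Properties.versionString
res0: String = version 2.11.12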
diopek
  • If this is a question, please create it as a new question – Haha TTpro Apr 24 '19 at 03:43
  • Yes, this is a new problem with the latest Spark version 2.4.1 using Scala 2.12.8 – diopek Apr 26 '19 at 03:39
  • This issue has been resolved after the new Spark 2.4.2 release today – diopek Apr 26 '19 at 04:27
  • @diopek, no, it isn't. I'm experiencing the same issue trying to `spark-submit` a sample app built with Spark 2.4.3 and Scala 2.12.8. – satorg May 25 '19 at 19:54
  • @satorg The Spark 2.4.3 pre-built distribution went BACK to being built with Scala 2.11.12. Here's the complete list: Spark 1.x built with Scala 2.10. Spark 2.0.0 - Spark 2.4.1 built with Scala 2.11. Spark 2.4.2 is the ONLY version built with Scala 2.12 at the moment. Spark 2.4.3 is back to Scala 2.11. – sparkour May 30 '19 at 15:49

I had a similar issue while following the instructions provided at https://spark.apache.org/docs/2.4.3/quick-start.html

My setup details: Spark version 2.4.3, Scala version 2.12.8

However, when I changed my sbt file to the configuration below, everything worked fine (both compiling and running the application jar).

name := "Simple Project"

version := "1.0"

scalaVersion := "2.11.11"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"

It looks like the Spark 2.4.3 pre-built distribution is compatible with Scala 2.11 only. While compiling the sample project, sbt downloaded the Scala 2.11 library from https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.11.11
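
To confirm which Scala version sbt actually resolved for the build, you can ask sbt directly (a quick check, assuming sbt 1.x):

sbt "show scalaVersion"

which should print the value from build.sbt, e.g. 2.11.11 here.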

Naveen

There is definitely some confusion regarding the Scala version to be used with Spark 2.4.3. As of today (Nov 25, 2019) the documentation home page for Spark 2.4.3 states:

Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.3 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x).

Note that support for Java 7, Python 2.6 and old Hadoop versions before 2.6.5 were removed as of Spark 2.2.0. Support for Scala 2.10 was removed as of 2.3.0. Support for Scala 2.11 is deprecated as of Spark 2.4.1 and will be removed in Spark 3.0.

Accordingly, the Scala version is supposed to be 2.12.
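
Under that reading, a build.sbt along these lines would match the documented Scala version; note that it must still agree with the Scala version your Spark distribution was actually built with (see the comments above about the 2.4.x pre-built releases):

name := "Simple Project"

version := "1.0"

scalaVersion := "2.12.8"

// %% resolves this to spark-sql_2.12, matching scalaVersion
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"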

Avinash Ganta

I used sdkman to install Scala and Spark.

I solved this[3] by:

  • finding the versions I installed[1]
  • updating build.sbt[2]

[1]

spark/2.4.7/hello_world via ☕ v1.8.0 via  vsuch on ☁️  (us-west-2) took 11s
❯ scala -version
cat: /Users/lgeoff/.sdkman/candidates/java/current/release: No such file or directory
Scala code runner version 2.11.12 -- Copyright 2002-2017, LAMP/EPFL

spark/2.4.7/hello_world via ☕ v1.8.0 via  vsuch on ☁️  (us-west-2)
❯

spark/2.4.7/hello_world via ☕ v1.8.0 via  vsuch on ☁️  (us-west-2)
❯ sdk list spark
================================================================================
Available Spark Versions
================================================================================
     3.2.0               2.3.2
     3.1.2               2.3.1
     3.1.1               2.3.0
     3.0.2               2.2.1
     3.0.1               2.2.0
     3.0.0               2.1.3
 > * 2.4.7               2.1.2
     2.4.6               2.1.1
     2.4.5               2.0.2
     2.4.4               1.6.3
     2.4.3               1.5.2
     2.4.2               1.4.1
     2.4.1
     2.4.0
     2.3.3

================================================================================
+ - local version
* - installed
> - currently in use
================================================================================
spark/2.4.7/hello_world via ☕ v1.8.0 via  vsuch on ☁️  (us-west-2)
❯

[2]

spark/2.4.7/hello_world via ☕ v1.8.0 via  vsuch on ☁️  (us-west-2) took 8s
❯ cat build.sbt
name := "Simple Project"

version := "1.0"

scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.7"

spark/2.4.7/hello_world via ☕ v1.8.0 via  vsuch on ☁️  (us-west-2)
❯

[3]

❯ spark-submit \
  --class "SimpleApp" \
  --master local[4] \
  target/scala-2.11/simple-project_2.11-1.0.jar

...

Lines with a: 61, Lines with b: 30

...

❯
Geoff Langenderfer