
My build.sbt looks like this:

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-aws" % sparkVersion % Provided,
  "org.apache.spark" %% "spark-core" % sparkVersion % Provided,
  "org.apache.spark" %% "spark-sql" % sparkVersion % Provided,
  "org.scala-lang" % "scala-library" % scalaVersion.value % Provided
)
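
(sparkVersion is defined elsewhere in the build; for reference, a sketch with an illustrative version number:)

val sparkVersion = "3.1.2" // illustrative; should match the Spark version the runtime provides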

While running my application from IntelliJ, I get NoClassDefFoundError exceptions because it cannot find the Spark libraries on the classpath. So when using IntelliJ, I need:

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-aws" % sparkVersion,
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.scala-lang" % "scala-library" % scalaVersion.value % Provided
)

But that causes the final fat jar to be very big.

How can I have a different list of Provided dependencies depending on whether the app is running from IntelliJ or not?
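
For illustration, one way to express what I'm after is a build-level flag like the sketch below (the localRun property is hypothetical and would have to be passed when sbt starts, e.g. -DlocalRun=true):

// Hypothetical flag: read a JVM system property when the build loads
val localRun = sys.props.get("localRun").contains("true")

// Compile scope when running locally (IntelliJ), Provided everywhere else
val sparkScope = if (localRun) "compile" else "provided"

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-aws" % sparkVersion % sparkScope,
  "org.apache.spark" %% "spark-core" % sparkVersion % sparkScope,
  "org.apache.spark" %% "spark-sql" % sparkVersion % sparkScope,
  "org.scala-lang" % "scala-library" % scalaVersion.value % Provided
)

One caveat: IntelliJ imports the project through sbt, so the flag would also have to be set in the IDE's sbt VM parameters.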

  • https://stackoverflow.com/a/21803413/1020190 may help. – Taha Oct 22 '21 at 15:53
  • What do you mean by "running your app in IntelliJ"? Unit test? – Gaël J Oct 22 '21 at 17:02
  • In my use case, this is a Spark job meant to run under EMR, which already provides these dependencies on the classpath. During development, I use IntelliJ and those dependencies are missing. I'd like a single build.sbt that addresses this situation. – Igor Gatis Oct 24 '21 at 12:07

1 Answer


The solution for my use case was to check IntelliJ's "Add dependencies with 'provided' scope to classpath" run configuration option. Found it here:
How to work efficiently with SBT, Spark and "provided" dependencies?

Since the UI has changed, here are the screenshots for IntelliJ 2021.2.3:

[screenshot: the run configuration showing the "Add dependencies with 'provided' scope to classpath" option]

After selecting it, you should see:

[screenshot: the run configuration with the option checked]
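
For completeness, the linked question also shows an sbt-side alternative: re-wiring run so that sbt run uses the compile classpath, which includes Provided dependencies. A rough sketch in sbt 1.x syntax (not what I ended up using, just an option):

Compile / run := Defaults.runTask(
  Compile / fullClasspath,   // the compile classpath includes Provided dependencies
  Compile / run / mainClass,
  Compile / run / runner
).evaluated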
