I've been trying to resolve this issue for a while but can't find an answer. I am writing a simple Spark application in Scala that instantiates a NiFi receiver. It builds successfully with SBT, but when I run it with spark-submit I get the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/nifi/spark/NiFiReceiver
at <app name>.main(main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.nifi.spark.NiFiReceiver
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 10 more
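For context, I am submitting the job with a command along these lines (the class name, master, and jar path are placeholders for my actual values):

spark-submit --class <app name> --master local[2] target/scala-2.10/<application-name>_2.10-1.0.jar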
I have tried a few variations, but this is my build.sbt file:
name := "<application name here>"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.2" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.6.2" % "provided"
libraryDependencies += "org.apache.nifi" % "nifi-spark-receiver" % "0.7.0"
libraryDependencies += "org.apache.nifi" % "nifi-site-to-site-client" % "0.7.0"
It should be noted that if I change the two NiFi lines to their Scala-style equivalents (i.e. replacing the first percent sign in each line with two percent signs), I instead get the following error when I run "sbt package":
[error] (*:update) sbt.ResolveException: unresolved dependency: org.apache.nifi#nifi-spark-receiver_2.10;0.7.0: not found
[error] unresolved dependency: org.apache.nifi#nifi-site-to-site-client_2.10;0.7.0: not found
As I mentioned before, with single percent signs (and therefore the plain Java artifacts) the build succeeds, but I then get the error at runtime.
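My understanding is that %% simply appends the Scala binary version to the artifact ID, so under scalaVersion 2.10.5 the following two lines should resolve the same artifact:

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.6.2" % "provided"
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.6.2" % "provided"

Since the NiFi libraries are plain Java artifacts, there is no nifi-spark-receiver_2.10 published, which would explain the unresolved dependency errors above.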
The relevant part of my application (with certain names removed) is as follows:
import java.io._
import java.nio.charset._
import java.time._

import org.apache.spark._
import org.apache.spark.SparkContext._
import org.apache.spark.serializer._
import org.apache.spark.storage._
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.receiver._

import org.apache.nifi._
import org.apache.nifi.events._
import org.apache.nifi.remote._
import org.apache.nifi.remote.client._
import org.apache.nifi.remote.protocol._
import org.apache.nifi.spark._
object <app name> {
  def main(args: Array[String]) {
    val nifiUrl = "<nifi url>"
    val nifiReceiverConfig = new SiteToSiteClient.Builder()
      .url(nifiUrl)
      .portName("Data for Spark")
      .buildConfig()
    val conf = new SparkConf().setAppName("<app name>")
    val ssc = new StreamingContext(conf, Seconds(10))
    val packetStream = ssc.receiverStream(new NiFiReceiver(nifiReceiverConfig, StorageLevel.MEMORY_ONLY))
The error refers to the last line above, where the NiFiReceiver is instantiated; that class apparently cannot be found anywhere at runtime.
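For completeness, the remainder of main just consumes the stream and starts the context, roughly like this (the exact processing here is a placeholder for what my real code does):

    packetStream.map(packet => new String(packet.getContent, StandardCharsets.UTF_8)).print()
    ssc.start()
    ssc.awaitTermination()
  }
}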
I have so far tried a number of approaches, including the following (separately):

1) Finding the jar files for nifi-spark-receiver and nifi-site-to-site-client and adding them to a lib directory in my project.

2) Following this post: https://community.hortonworks.com/articles/12708/nifi-feeding-data-to-spark-streaming.html. I made a copy of spark-defaults.conf.template in my Spark conf directory, renamed it to spark-defaults.conf, and added the two lines from step 1 at that link (substituting the actual names and locations of the files in question). I then made sure I had all the import statements used in the two code examples on that page.

3) Creating a project directory at the root of my application directory, with a file called assembly.sbt inside it containing the following line (as referenced here: https://github.com/sbt/sbt-assembly):
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
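From the sbt-assembly README I also understand that build.sbt may need a merge strategy to deal with duplicate files when the dependency jars are combined; this is the sketch I intended to use (not yet verified on my side):

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}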
With the plugin in place, I ran "sbt assembly" instead of "sbt package" so that the application would build an uber jar, but this also failed, with the same error I get from "sbt package" when the Scala-style dependencies are in build.sbt:
[error] (*:update) sbt.ResolveException: unresolved dependency: org.apache.nifi#nifi-spark-receiver_2.10;0.7.0: not found
[error] unresolved dependency: org.apache.nifi#nifi-site-to-site-client_2.10;0.7.0: not found
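I am also wondering whether I should instead be passing the two NiFi jars to spark-submit explicitly, along these lines (the paths are placeholders for wherever the jars actually live):

spark-submit --class <app name> \
  --jars /path/to/nifi-spark-receiver-0.7.0.jar,/path/to/nifi-site-to-site-client-0.7.0.jar \
  target/scala-2.10/<application-name>_2.10-1.0.jar

but I have not confirmed whether that is the right approach.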
Please let me know if any further information is required. Thanks in advance for any help.