
I have submitted a Spark Streaming job in yarn-cluster mode,

but I am getting the following error.

spark-submit command:

export SPARK_CLASSPATH=/usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar:/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar
spark-submit --master yarn-cluster --keytab /etc/security/keytabs/srvc_egsc_hdpuser.service.keytab --principal srvc_egsc_hdpuser@EAPKDC.HOUSTON.HP.COM --queue sc_streaming --class com.reni.scmplatform.data.producer.DPMain  --executor-memory 5g --driver-memory 8g --conf spark.sql.shuffle.partitions=10 --conf spark.default.parallelism=50 --jars /usr/hdp/current/hbase-client/lib/hbase-common.jar,/usr/hdp/current/hbase-client/lib/hbase-client.jar,/usr/hdp/current/hbase-client/lib/hbase-server.jar,/usr/hdp/current/hbase-client/lib/hbase-protocol.jar,/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar,/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar --files /etc/spark/conf/hbase-site.xml,/etc/spark/conf/hive-site.xml hdfs://EAPROD/EA/supplychain/streaming/logistics/entaly/jars/DataProducer-assembly-1.0.15-SNAPSHOT.jar --platform.framework.hdfs.logging.dir=/EA/supplychain/process/logs/logistics/entaly/dataProducer --platform.framework.logging.level=info --platform.framework.logging.publish=true

Error:

18/03/12 05:14:30 ERROR ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Exception when registering SparkListener
org.apache.spark.SparkException: Exception when registering SparkListener
        at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2154)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:578)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2280)
        at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:140)
        at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:877)
        at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:877)
        at scala.Option.map(Option.scala:145)
        at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:877)
        at com.reni.scmplatform.data.producer.helper.DPStreamEventHandler.start(DPStreamEventHandler.scala:63)
        at com.reni.scmplatform.data.producer.DPMain$.main(DPMain.scala:27)
        at com.reni.scmplatform.data.producer.DPMain.main(DPMain.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:561)
Caused by: java.lang.ClassNotFoundException: com.pepperdata.spark.metrics.PepperdataSparkListener
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:175)
        at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2122)
        at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2119)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2119)
        ... 15 more
18/03/12 05:14:30 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
18/03/12 05:14:30 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
18/03/12 05:14:30 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Exception when registering SparkListener)

1 Answer


You should add the JAR containing the missing class to the job classpath using the --jars option (see this answer: spark submit add multiple jars in classpath).
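For example, a sketch of the adjusted command, assuming the listener class ships in a Pepperdata agent jar installed on the submit host (the path and jar name below are hypothetical; check your cluster's Pepperdata installation for the real location):

# Hypothetical path to the jar providing com.pepperdata.spark.metrics.PepperdataSparkListener
PEPPERDATA_JAR=/opt/pepperdata/lib/pepperdata-spark-listener.jar

# Append it to the existing --jars list so it is shipped to the driver and executors
spark-submit --master yarn-cluster \
  ... \
  --jars /usr/hdp/current/hbase-client/lib/hbase-common.jar,<other jars as before>,${PEPPERDATA_JAR} \
  ...

Note that in yarn-cluster mode the driver runs inside the ApplicationMaster on the cluster, so a jar that is only on the local SPARK_CLASSPATH of the machine you submit from is not visible to it; --jars uploads the jar so the listener class can be loaded there.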

Moreover, I use the sbt-assembly plugin to take care of these things:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

Then build with sbt assembly, and all the JARs your application needs will be included in the job jar sent to YARN.
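A minimal sketch of that setup, assuming a standard sbt layout (the dependency versions below are illustrative, not taken from this project):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// build.sbt
name := "DataProducer"

libraryDependencies ++= Seq(
  // Spark is supplied by the YARN cluster at runtime, so mark it "provided"
  // to keep it out of the fat jar (version is an example; match your cluster)
  "org.apache.spark" %% "spark-streaming" % "1.6.3" % "provided",
  // Compile-scoped dependencies like the HBase client get bundled into the assembly
  "org.apache.hbase" % "hbase-client" % "1.1.2"
)

Running sbt assembly then produces a single fat jar under target/ that you can pass to spark-submit without listing each dependency with --jars.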

  • Thanks... but which jar do I need to add? I have already provided all the required jars in the spark-submit command. – Sankarlal Mar 12 '18 at 07:00
  • Perhaps you are missing the one containing `com.pepperdata.spark.metrics.PepperdataSparkListener`? I can only see HBase-related jars. – Victor Mar 12 '18 at 07:05