
I am trying to run Spark Streaming with Kafka. I am using Scala 2.11.8 and Spark 2.1.0 built on Scala 2.11.8. I understand that this error usually indicates a Scala version mismatch, but all the dependencies are declared with the correct versions (listed below) and I am still getting this error:

    Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
        at kafka.utils.Pool.<init>(Unknown Source)
        at kafka.consumer.FetchRequestAndResponseStatsRegistry$.<init>(Unknown Source)
        at kafka.consumer.FetchRequestAndResponseStatsRegistry$.<clinit>(Unknown Source)
        at kafka.consumer.SimpleConsumer.<init>(Unknown Source)
        at org.apache.spark.streaming.kafka.KafkaCluster.connect(KafkaCluster.scala:59)
        at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:364)
        at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:361)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
        at org.apache.spark.streaming.kafka.KafkaCluster.org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers(KafkaCluster.scala:361)
        at org.apache.spark.streaming.kafka.KafkaCluster.getPartitionMetadata(KafkaCluster.scala:132)
        at org.apache.spark.streaming.kafka.KafkaCluster.getPartitions(KafkaCluster.scala:119)
        at org.apache.spark.streaming.kafka.KafkaUtils$.getFromOffsets(KafkaUtils.scala:211)
        at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:484)
        at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:607)
        at com.forrester.streaming.kafka.App$.main(App.scala:19)
        at com.forrester.streaming.kafka.App.main(App.scala)
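The code behind `App.scala:19` is not shown above; for reference, a minimal direct-stream setup against the 0-8 API looks roughly like the sketch below (the app name, broker address, and topic are placeholders, not taken from my project). `createDirectStream` is the call that fails in the trace:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object App {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-direct-stream")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Broker list and topic name are placeholders
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics = Set("my-topic")

    // This is the createDirectStream frame in the stack trace; it connects to
    // the brokers through kafka.consumer.SimpleConsumer, which is where the
    // NoClassDefFoundError surfaces
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.map(_._2).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```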

Dependencies

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.11.8</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>com.koverse</groupId>
            <artifactId>koverse-shaded-deps</artifactId>
            <version>${koverse.version}</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.11</artifactId>
            <version>2.1.0</version>
            <exclusions>
                <exclusion>
                    <groupId>*</groupId>
                    <artifactId>*</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.scalanlp</groupId>
            <artifactId>breeze_2.11</artifactId>
            <version>0.11.2</version>
        </dependency>

        <dependency>
            <groupId>org.xerial.snappy</groupId>
            <artifactId>snappy-java</artifactId>
            <version>1.0.5</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.1.0</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-8-assembly_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>
    </dependencies>
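Since every declared artifact above carries the `_2.11` suffix, one thing worth checking is whether a transitive dependency is still pulling a Scala 2.10 build of Kafka or scala-library onto the classpath. Maven can show this directly (run from the module root; `-Dverbose` and `-Dincludes` are standard `dependency:tree` parameters):

```shell
# List every org.scala-lang and org.apache.kafka artifact Maven resolves;
# -Dverbose also prints versions omitted due to conflicts
mvn dependency:tree -Dverbose -Dincludes=org.scala-lang,org.apache.kafka
```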

I did more analysis on different versions:

| Case | Spark build (Scala) | Kafka jar                                          | Result                        |
| ---- | ------------------- | -------------------------------------------------- | ----------------------------- |
| 1    | 2.1.1 on 2.11.8     | spark-streaming-kafka-0-8-assembly_2.11-2.1.1.jar  | **Working**                   |
| 2    | 2.1.1 on 2.11.8     | spark-streaming-kafka-0-8-assembly_2.10-2.1.1.jar  | Error, as expected            |
| 3    | 2.1.1 on 2.11.8     | spark-streaming-kafka-0-8-assembly_2.10-2.1.0.jar  | Error, as expected            |
| 4    | 2.1.0 on 2.11.8     | spark-streaming-kafka-0-8-assembly_2.10-2.1.0.jar  | Error, as expected            |
| 5    | 2.1.0 on 2.11.8     | spark-streaming-kafka-0-8-assembly_2.11-2.1.0.jar  | **Error: ideally should pass** |
| 6    | 2.1.0 on 2.11.8     | spark-streaming-kafka-0-8-assembly_2.11-2.1.1.jar  | Error, as expected            |
| 7    | 2.1.0 on 2.11.8     | spark-streaming-kafka-0-8-assembly_2.10-2.1.0.jar  | Error, as expected            |

Error message in the failing cases: `ClassNotFoundException: scala.collection.GenTraversableOnce$class`

Case 1 works, but case 5 fails even though its versions all match and it should not throw any error.

  • What build tool are you using, mvn or sbt? Check your "org.scala-lang" dependency version. – geo Sep 29 '17 at 05:21
  • I am using Maven version 3.5.0 – kris Sep 29 '17 at 13:32
  • Then what does your POM file look like? Have a look at [link1](https://stackoverflow.com/questions/36050341/apache-spark-exception-in-thread-main-java-lang-noclassdeffounderror-scala-co), adjust your POM.xml accordingly and try again, or post your POM.xml contents here. – geo Sep 29 '17 at 15:09
  • The first line of your error suggests a Scala version incompatibility, so you should probably double-check the "org.scala-lang" version in your POM file. – geo Sep 29 '17 at 15:18
