
I wrote code to consume data from the "topicTest1" Kafka topic, but I am not able to print the data from the consumer. The errors I got are mentioned below.

Below is my code to consume the data:

import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public static void main(String[] args) throws Exception {

        // StreamingExamples.setStreamingLogLevels();
        SparkConf sparkConf = new SparkConf().setAppName("JavaKafkaWordCount").setMaster("local[*]");
        // Create the streaming context with a 100 ms batch interval
        JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, new Duration(100));

        int numThreads = Integer.parseInt("3");
        Map<String, Integer> topicMap = new HashMap<>();
        String[] topics = "topicTest1".split(",");
        for (String topic : topics) {
            topicMap.put(topic, numThreads);
        }

        JavaPairReceiverInputDStream<String, String> messages = KafkaUtils.createStream(jssc, "9.98.171.226:9092", "1",
                topicMap);

        messages.print();
        jssc.start();
        jssc.awaitTermination();
    }

Using the following dependencies:

<dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka_2.10</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-twitter_2.11</artifactId>
            <version>1.6.1</version>
        </dependency>

Below is the first error I got:

 Exception in thread "dispatcher-event-loop-0" java.lang.NoSuchMethodError: scala/Predef$.$conforms()Lscala/Predef$$less$colon$less; (loaded from file:/C:/Users/Administrator/.m2/repository/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar by sun.misc.Launcher$AppClassLoader@4b69b358) called from class org.apache.spark.streaming.scheduler.ReceiverSchedulingPolicy (loaded from file:/C:/Users/Administrator/.m2/repository/org/apache/spark/spark-streaming_2.11/1.6.2/spark-streaming_2.11-1.6.2.jar by sun.misc.Launcher$AppClassLoader@4b69b358).
        at org.apache.spark.streaming.scheduler.ReceiverSchedulingPolicy.scheduleReceivers(ReceiverSchedulingPolicy.scala:138)
        at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$receive$1.applyOrElse(ReceiverTracker.scala:450)
        at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
        at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
        at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
        at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.lang.Thread.run(Thread.java:785)

Another error:

Exception in thread "JobGenerator" java.lang.NoSuchMethodError: scala/Predef$.$conforms()Lscala/Predef$$less$colon$less; (loaded from file:/C:/Users/Administrator/.m2/repository/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar by sun.misc.Launcher$AppClassLoader@4b69b358) called from class org.apache.spark.streaming.scheduler.ReceivedBlockTracker (loaded from file:/C:/Users/Administrator/.m2/repository/org/apache/spark/spark-streaming_2.11/1.6.2/spark-streaming_2.11-1.6.2.jar by sun.misc.Launcher$AppClassLoader@4b69b358).
    at org.apache.spark.streaming.scheduler.ReceivedBlockTracker.allocateBlocksToBatch(ReceivedBlockTracker.scala:114)
    at org.apache.spark.streaming.scheduler.ReceiverTracker.allocateBlocksToBatch(ReceiverTracker.scala:203)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:247)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:246)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:246)
    at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:181)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:87)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:86)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Vimal Dhaduk

1 Answer


Make sure that you use matching versions. Let's say you use the following Maven dependency:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka_2.10</artifactId>
        <version>1.6.1</version>
    </dependency>

So the artifact is spark-streaming-kafka_2.10; the _2.10 suffix means it was built against Scala 2.10.

Now check if you use the correct Kafka version:

cd /KAFKA_HOME/libs

Now find kafka_YOUR-VERSION-sources.jar.

In case you have kafka_2.10-0xxxx-sources.jar you are fine! :) If you use a different version, just change the Maven dependencies OR download the correct Kafka version.
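The check above can also be scripted. This is a minimal sketch; the jar name below is a hypothetical example (substitute a real file from your Kafka libs directory), and the sed pattern simply pulls the Scala version out of the filename:

```shell
# Hypothetical jar name; replace with an actual file from $KAFKA_HOME/libs
jar="kafka_2.10-0.8.2.1-sources.jar"

# Extract the Scala version embedded in the filename (the part after "kafka_")
scala_version=$(echo "$jar" | sed 's/^kafka_\([0-9.]*\)-.*/\1/')
echo "Kafka is built for Scala $scala_version"
```

If the printed version does not match the _2.xx suffix of your Spark artifacts, that is the mismatch to fix.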

After that, check your Spark version. Make sure every Spark artifact uses the same Scala suffix and version:

    groupId: org.apache.spark
    artifactId: spark-core_2.xx
    version: xxx
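Note that the dependency list in the question mixes _2.10 and _2.11 artifacts, which is exactly the kind of mismatch that produces the scala/Predef$.$conforms NoSuchMethodError. Pinning every Spark artifact to one Scala suffix might look like this (a sketch; _2.10 and 1.6.1 are chosen here to match the Kafka artifact above):

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.10</artifactId>
    <version>1.6.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka_2.10</artifactId>
    <version>1.6.1</version>
</dependency>
```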

lidox
  • Still the message is not printing in the log. Is there any way to see it? – Vimal Dhaduk Nov 15 '16 at 04:25
  • Can you describe the problem more precisely? Which message isn't displayed in the log? Do you mean your consumer does not get any messages? – lidox Nov 15 '16 at 06:29
  • I have some example code using Kafka and Apache Flink here: https://github.com/lidox/big-data-fun/tree/2eb800725c894521a322ba7f1382491f69074e38/kafka-flink-101/src/main/java/com/artursworld . It should be the same principle – lidox Nov 15 '16 at 06:33
  • Thanks for the info, but that code seems to be for Flink, and we are using Spark. I describe the issue in more detail, i.e. how to access the data coming from Kafka, here: http://stackoverflow.com/questions/40603254/how-to-get-use-data-that-coming-from-kafka-using-spark – Vimal Dhaduk Nov 15 '16 at 06:43