
I am new to Spark Streaming. I followed the tutorial at this link: https://spark.apache.org/docs/latest/streaming-programming-guide.html

When I ran the code, I could see that the lines were being received, but I could not see any output with a timestamp.

I could only see this log:

14/10/22 15:24:17 INFO scheduler.ReceiverTracker: Stream 0 received 0 blocks
14/10/22 15:24:17 INFO scheduler.JobScheduler: Added jobs for time 1414005857000 ms
.....

Also, I was trying to save the last DStream with a foreachRDD call, but the data was not being stored. If anyone can help me with this, it would be a great help.
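For reference, a minimal sketch of the guide's NetworkWordCount example with a save added via foreachRDD (the output path is illustrative, and this assumes Spark Streaming is on the classpath):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object NetworkWordCount {
  def main(args: Array[String]): Unit = {
    // "local[2]" or more: one thread for the receiver, at least one for processing
    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(1))

    val lines = ssc.socketTextStream("localhost", 9999)
    val wordCounts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)

    // Prints the first elements of each batch, prefixed with the batch timestamp
    wordCounts.print()

    // Save each non-empty batch; "output/..." is a placeholder path
    wordCounts.foreachRDD { rdd =>
      if (rdd.count() > 0) rdd.saveAsTextFile(s"output/batch-${System.currentTimeMillis()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With only one thread (setMaster("local")), the receiver occupies it and none of the transformations above ever run.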

sms_1190

3 Answers


I met the same problem; here is how I solved it:

change

val conf = new SparkConf().setMaster("local")

to

val conf = new SparkConf().setMaster("local[*]")

Using setMaster("local") is a mistake: it runs Spark with a single thread, which the streaming receiver occupies, so no thread is left to actually process the data.

Hope this is the problem you met.

valaxy

The print is working, as evidenced by the ..... separator; there's just nothing to print: the DStream is empty. The log you provided shows exactly that: Stream 0 received 0 blocks.

Make sure you're actually sending data to your Receiver (for the guide's socketTextStream example, by running nc -lk 9999 and typing lines into it).

maasg
  • I was getting data, but only after 20 seconds. Do you know how to do streaming with Kafka? I'm stuck with Kafka now: I am getting the data, but it is not reaching the print() function. – sms_1190 Oct 22 '14 at 21:07
  • Hard to say. I guess you need to post a new question with the code you're using. – maasg Oct 22 '14 at 21:09
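For the Kafka follow-up in the comments, the receiver-based API of that era looked roughly like this (a sketch, assuming the spark-streaming-kafka artifact is on the classpath; the ZooKeeper address, group id, and topic name are placeholders):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaWordCount {
  def main(args: Array[String]): Unit = {
    // At least local[2] so the Kafka receiver does not starve processing
    val conf = new SparkConf().setMaster("local[*]").setAppName("KafkaWordCount")
    val ssc = new StreamingContext(conf, Seconds(2))

    // createStream(ssc, zkQuorum, groupId, Map(topic -> receiverThreads))
    val lines = KafkaUtils.createStream(ssc, "localhost:2181", "test-group", Map("test" -> 1))
      .map(_._2) // keep message values; keys are the Kafka message keys

    lines.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The same single-thread pitfall applies here: with setMaster("local"), the Kafka receiver consumes the only thread and print() never fires.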
val conf = new SparkConf().setMaster("local[*]") works.

local[*]: the '*' means create as many worker threads as the machine has CPU cores.
With "local", only a single thread is created, which the streaming receiver occupies, so nothing is actually computed. Refer to: What does setMaster `local[*]` mean in Spark?
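As a self-contained illustration (plain Scala, no Spark required) of what '*' expands to, local[*] resolves to the JVM's reported core count:

```scala
object LocalStarThreads {
  def main(args: Array[String]): Unit = {
    // The same value Spark uses when expanding "local[*]"
    val cores = Runtime.getRuntime.availableProcessors()
    println(s"""setMaster("local[*]") uses $cores worker threads on this machine""")
    println("""setMaster("local") uses exactly 1 thread""")
  }
}
```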

barbsan