I ran into a "Task not serializable" error when running Spark Streaming. The reason can be found in this thread.

I tried a couple of approaches and fixed the issue, but I don't understand why the fix works.

public class StreamingNotWorking implements Serializable {

    // Instance fields: part of this object's serialized state.
    private SparkConf sparkConf;
    private JavaStreamingContext jssc;

    public StreamingNotWorking(parameter) {
        sparkConf = new SparkConf();
        this.jssc = createContext(parameter);
        JavaDStream<String> messages = functionCreateStream(parameter);
        messages.print();
    }

    public void run() {
        this.jssc.start();
        this.jssc.awaitTermination();
    }
}

public class streamingNotWorkingDriver {

    public static void main(String[] args) {
        StreamingNotWorking bieventsStreaming = new StreamingNotWorking(parameter);
        bieventsStreaming.run();
    }
}

This gives the same "Task not serializable" error.

However, if I modify the code as follows, it works perfectly fine:

public class StreamingWorking implements Serializable {

    // Static fields: they belong to the class, not to an instance.
    private static SparkConf sparkConf;
    private static JavaStreamingContext jssc;

    public void createStream(parameter) {
        sparkConf = new SparkConf();
        jssc = createContext(parameter);
        JavaDStream<String> messages = functionCreateStream(parameter);
        messages.print();
        run();
    }

    public void run() {
        jssc.start();
        jssc.awaitTermination();
    }
}

public class streamingWorkingDriver {

    public static void main(String[] args) {
        StreamingWorking bieventsStreaming = new StreamingWorking();
        bieventsStreaming.createStream(parameter);
    }
}

I know that part of the reason is that sparkConf and jssc need to be static, but I don't understand why.

Could anyone explain the difference?

DoraShine

1 Answer

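Neither JavaStreamingContext nor SparkConf implements Serializable.

You can't serialize instances of classes that don't implement this interface.

Static members won't be serialized: they belong to the class rather than to any instance, so they are not part of an object's serialized state. That is why making sparkConf and jssc static avoids the error.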

More information can be found here:

http://docs.oracle.com/javase/7/docs/api/java/io/Serializable.html
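
The difference can be reproduced with plain Java serialization, without Spark. The following is only a minimal sketch: Context is a hypothetical stand-in for a non-serializable class such as JavaStreamingContext, and the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class StaticFieldDemo {

    // Hypothetical stand-in for JavaStreamingContext: does NOT implement Serializable.
    static class Context {}

    // Instance field: it is part of the object's serialized state.
    static class WithInstanceField implements Serializable {
        private Context ctx = new Context();
    }

    // Static field: it belongs to the class, so default serialization skips it.
    static class WithStaticField implements Serializable {
        private static Context ctx = new Context();
    }

    // Returns true if obj can be written with standard Java serialization.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            // Thrown when a non-serializable object is reached while writing.
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new WithInstanceField())); // prints "false"
        System.out.println(canSerialize(new WithStaticField()));   // prints "true"
    }
}
```

The first class fails for the same reason as StreamingNotWorking: serializing the object drags its non-serializable instance field along. The second succeeds because the static field is never visited.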

Marcinek