I am trying to iterate over a JavaRDD<Tuple2<String, Object>> and build a JSONArray from its data.
My code:
    final JSONArray jA = new JSONArray();
    final VoidFunction<Tuple2<String, Object>> func = new VoidFunction<Tuple2<String, Object>>() {
        @Override
        public void call(Tuple2<String, Object> arg0) throws Exception {
            JSONObject obj = new JSONObject();
            obj.put(columnName, arg0._1);
            obj.put("frequency", (String) arg0._2);
            jA.put(obj);
        }
    };
    outputRdd.foreach(func);
I get the following serialization error (full stack trace trimmed for readability):
org.apache.spark.SparkException: Task not serializable
Caused by: java.io.NotSerializableException: org.json.JSONArray
Serialization stack:
- object not serializable (class: org.json.JSONArray, value: [])
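My guess is that the anonymous VoidFunction captures jA, so Spark tries to serialize the JSONArray along with the function when it ships the task. To check that guess I put together a minimal sketch that needs neither Spark nor org.json. Everything in it is a hypothetical stand-in I made up for illustration: NotSerializableSink plays the role of JSONArray, SerializableAction plays the role of VoidFunction, and tryJavaSerialize just pushes an object through plain Java serialization, which I am assuming is close enough to what Spark does with the closure.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ClosureCaptureDemo {

    // Hypothetical stand-in for org.json.JSONArray: it does not implement Serializable.
    public static class NotSerializableSink {
        final List<String> items = new ArrayList<>();
    }

    // Hypothetical stand-in for Spark's VoidFunction, which is a Serializable interface.
    public interface SerializableAction<T> extends Serializable {
        void call(T value) throws Exception;
    }

    // Runs the object through plain Java serialization.
    // Returns null on success, or the exception message if some object in the graph
    // is not serializable.
    public static String tryJavaSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same shape as my code: the anonymous class captures a non-serializable local,
    // which becomes a field of the anonymous class and is serialized with it.
    public static SerializableAction<String> capturing() {
        final NotSerializableSink sink = new NotSerializableSink();
        return new SerializableAction<String>() {
            @Override
            public void call(String value) {
                sink.items.add(value);
            }
        };
    }

    // Captures only a String (like columnName), which is serializable.
    public static SerializableAction<String> nonCapturing() {
        final String label = "frequency";
        return new SerializableAction<String>() {
            @Override
            public void call(String value) {
                System.out.println(label + "=" + value);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println("capturing:     " + tryJavaSerialize(capturing()));
        System.out.println("non-capturing: " + tryJavaSerialize(nonCapturing()));
    }
}
```

In this sketch only the function that captures the non-serializable object fails to serialize, which looks like the same pattern as my Spark code, where jA is captured by func.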
Any pointers or workarounds?
Thanks :)