This question comes from my other question. In the comments there, I observed a behaviour that I want to ask about: do SparkContext jobs have to run in the main method only? For example, the code below doesn't work; the Spark job is created, but the executor keeps running and never finishes.
    import org.apache.spark.SparkContext

    object App38 {
      val sc = new SparkContext("local[1]", "SimpleProg")
      val nums = sc.parallelize(List(1, 2, 3, 4))
      println(nums.reduce((a, b) => a - b)) // the action is triggered during object initialization

      def main(args: Array[String]): Unit = {
        // println(nums.reduce((a, b) => a - b))
      }
    }
whereas if I put only the reduce call in the main method (code below), it runs fine:
    import org.apache.spark.SparkContext

    object App38 {
      val sc = new SparkContext("local[1]", "SimpleProg")
      val nums = sc.parallelize(List(1, 2, 3, 4))
      // println(nums.reduce((a, b) => a - b))

      def main(args: Array[String]): Unit = {
        println(nums.reduce((a, b) => a - b)) // the action is triggered inside main
      }
    }
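For reference, here is a minimal sketch of the only other arrangement I can think of, with everything (context, RDD, and action) created inside main; the sc.stop() call at the end is my assumption about the usual cleanup, not something from my original code:

    import org.apache.spark.SparkContext

    object App38 {
      def main(args: Array[String]): Unit = {
        // everything lives inside main: context creation, the RDD, and the action
        val sc = new SparkContext("local[1]", "SimpleProg")
        val nums = sc.parallelize(List(1, 2, 3, 4))
        println(nums.reduce((a, b) => a - b))
        sc.stop() // assuming explicit shutdown is the expected cleanup
      }
    }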
What explains this behaviour? I'm new to Spark, so any help is appreciated.